Diagnosis and treatment of skeletal degeneration conditions

ABSTRACT

This invention relates to methods and compositions for the diagnosis and treatment of conditions that affect skeletal growth. More specifically, the invention relates to isolated molecules that can be used to promote chondrogenesis. These molecules, therefore, are useful in the treatment of various disorders that affect the skeleton, including bone and cartilage degeneration conditions.

RELATED APPLICATIONS

[0001] This application claims priority under 35 USC §119(e) from U.S. Provisional Patent Application Serial No. 60/274,980, filed on Mar. 12, 2002, entitled DIAGNOSIS AND TREATMENT OF SKELETAL DEGENERATION CONDITIONS. The contents of the provisional application are hereby expressly incorporated by reference.

GOVERNMENT SUPPORT

[0002] The work resulting in this invention was supported in part by NIH Grant No. AR44873. Accordingly, the U.S. Government may therefore be entitled to certain rights in the invention.

FIELD OF THE INVENTION

[0003] This invention relates to methods and compositions for the diagnosis and treatment of conditions that affect skeletal growth. More specifically, the invention relates to isolated molecules that can be used to promote chondrogenesis. These molecules, therefore, are useful in the treatment of various disorders that affect the skeleton, including cartilage degeneration conditions.

BACKGROUND OF THE INVENTION

[0004] Articular cartilage, the thin, fragile tissue layer covering the ends of bones, allows healthy joints to move freely and without pain. Many arthritic diseases and many degrees of trauma can, however, cause destruction or deterioration of this fragile layer, leading to pain, joint stiffness, and even crippling. A common belief has been that this fragile surface, once lost, could never be restored. Attempts made in the past to regenerate or otherwise repair articular cartilage have been unsuccessful, thereby directing medical science to the development of substitutes (such as implants), abandoning the potential for regeneration.

[0005] There exists a continued need for the development of alternative methods of cartilage regeneration and for alleviating the pain associated with cartilage degeneration conditions.

SUMMARY OF THE INVENTION

[0006] This invention provides methods and compositions for the diagnosis and treatment of congenital and/or acquired conditions affecting skeletal (cartilaginous/bone) growth. More specifically, we have identified a number of genes that are modulated in mesenchymal cells when the cells are cultured in a system that simulates physiological skeletal growth conditions. It has been discovered that such gene modulation leads to the acquirement of a chondroblastic phenotype by the mesenchymal cells (i.e., to cartilage/bone formation). In view of these discoveries, it is believed that the molecules of the present invention can be used to promote cartilage/bone formation, and in particular, to treat congenital and/or acquired conditions that affect the skeleton, such as cartilaginous tissue degeneration conditions that include all forms of arthritis such as osteoarthritis, rheumatoid arthritis, osteochondrosis, and the like. Additionally, methods for using these molecules in the diagnosis of any of the foregoing skeletal degeneration conditions, are also provided.

[0007] Furthermore, methods for using these molecules in vivo or in vitro for the purpose of modulating mesenchymal cell differentiation, methods for treating conditions associated with skeletal degeneration, and compositions useful in the preparation of therapeutic preparations for the treatment of the foregoing conditions, are also provided.

[0008] The present invention thus involves, in several aspects, polypeptides modulating mesenchymal cell differentiation, isolated nucleic acids encoding those polypeptides, functional modifications and variants of the foregoing, useful fragments of the foregoing, as well as therapeutics and diagnostics relating thereto.

[0009] According to one aspect of the invention, isolated nucleic acid molecules are provided. Such nucleic acid molecules include: (a) a nucleic acid molecule which hybridizes under stringent conditions to a molecule consisting of a nucleotide sequence set forth as SEQ ID NO:1-11 and which code for a polypeptide that induces differentiation of a mesenchymal cell, (b) nucleic acid molecules that differ from the nucleic acid molecules of (a) in codon sequence due to the degeneracy of the genetic code, and (c) complements of (a) or (b). In certain embodiments, the isolated nucleic acid molecule comprises the nucleotide sequence set forth as SEQ ID NO:1-11. The invention in another aspect provides an isolated nucleic acid molecule selected from the group consisting of (a) unique fragments of a nucleotide sequence set forth as SEQ ID NO:1-11, and (b) complements of (a), provided that a unique fragment of (a) includes a sequence of contiguous nucleotides which is not identical to any known sequence as of the filing date of the instant application.

[0010] In one embodiment, the sequence of contiguous nucleotides is selected from the group consisting of (1) at least two contiguous nucleotides nonidentical to the sequence group, (2) at least three contiguous nucleotides nonidentical to the sequence group, (3) at least four contiguous nucleotides nonidentical to the sequence group, (4) at least five contiguous nucleotides nonidentical to the sequence group, (5) at least six contiguous nucleotides nonidentical to the sequence group, and (6) at least seven contiguous nucleotides nonidentical to the sequence group.

[0011] In another embodiment, the fragment has a size selected from the group consisting of at least: 8 nucleotides, 10 nucleotides, 12 nucleotides, 14 nucleotides, 16 nucleotides, 18 nucleotides, 20 nucleotides, 22 nucleotides, 24 nucleotides, 26 nucleotides, 28 nucleotides, 30 nucleotides, 40 nucleotides, 50 nucleotides, 75 nucleotides, 100 nucleotides, 200 nucleotides, 1000 nucleotides and every integer length therebetween.

[0012] According to another aspect, the invention provides expression vectors, and host cells transformed or transfected with such expression vectors, comprising the nucleic acid molecules described above.

[0013] According to another aspect of the invention, an isolated polypeptide is provided. The isolated polypeptide is encoded by the foregoing nucleic acid molecules of the invention. In some embodiments, the isolated polypeptide is encoded by the nucleic acid of SEQ ID NO:11, giving rise to a polypeptide having the sequence of SEQ ID NO:12 that induces mesenchymal cell differentiation. In other embodiments, the isolated polypeptide may be a fragment or variant of the foregoing of sufficient length to represent a sequence unique within the human genome, and identifying with a polypeptide that induces mesenchymal cell differentiation, provided that the fragment includes a sequence of contiguous amino acids which is not identical to any sequence known as of the filing date of the instant application. In another embodiment, immunogenic fragments of the polypeptide molecules described above are provided. The immunogenic fragments may or may not induce mesenchymal cell differentiation.

[0014] According to another aspect of the invention, isolated binding polypeptides are provided which selectively bind a polypeptide encoded by the foregoing nucleic acid molecules of the invention. Preferably the isolated binding polypeptides selectively bind a polypeptide which comprises the sequence of SEQ ID NO:12, or fragments thereof. In preferred embodiments, the isolated binding polypeptides include antibodies and fragments of antibodies (e.g., Fab, F(ab)₂, Fd and antibody fragments which include a CDR3 region which binds selectively to the polypeptide of SEQ ID NO:12). In certain embodiments, the antibodies are human. In some embodiments, the antibodies are monoclonal antibodies. In one embodiment, the antibodies are polyclonal antisera. In further embodiments, the antibodies are humanized. In yet further embodiments, the antibodies are chimeric. According to a further aspect of the invention, a method for determining the level of SEQ ID NO:1-11 expression in a subject, is provided. The method involves measuring expression of SEQ ID NO:1-11 in a test sample from a subject to determine the level of SEQ ID NO:1-11 expression in the subject. In certain embodiments, the measured SEQ ID NO:1-11 expression in the test sample is compared to SEQ ID NO:1-11 expression in a control containing a known level of SEQ ID NO:1-11 expression. Expression is defined as SEQ ID NO:1-11 mRNA expression, expression of a polypeptide encoded by SEQ ID NO:1-11, or mesenchymal cell differentiation induction activity as defined elsewhere herein. Various methods can be used to measure expression. Preferred embodiments of the invention include PCR and Northern blotting for measuring mRNA expression, monoclonal antibodies or polyclonal antisera against polypeptides encoded by SEQ ID NO:1-11 as reagents to measure polypeptide expression, as well as methods for measuring mesenchymal cell differentiation induction activity.

[0015] In certain embodiments, tissue samples such as biopsy samples, and biological fluids such as blood, are used as test samples. SEQ ID NO:1-11 expression in a test sample of a subject is compared to SEQ ID NO:1-11 expression in a control.

[0016] According to another aspect of the invention, a method for identifying an agent useful in modulating mesenchymal cell differentiation induction activity of a molecule, is provided. The method involves: (a) contacting a molecule having mesenchymal cell differentiation induction activity with a candidate agent, (b) measuring mesenchymal cell differentiation induction activity of the molecule, and (c) comparing the measured mesenchymal cell differentiation induction activity of the molecule to a control to determine whether the candidate agent modulates mesenchymal cell differentiation induction activity of the molecule, wherein the molecule is a nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, or an expression product thereof. In certain embodiments, the control is mesenchymal cell differentiation induction activity of the molecule measured in the absence of the candidate agent.

[0017] According to still another aspect of the invention, a method of diagnosing a condition characterized by aberrant expression of a nucleic acid molecule or an expression product thereof, is provided. The method involves: (a) contacting a biological sample from a subject with an agent, wherein said agent specifically binds to said nucleic acid molecule, an expression product thereof, or a fragment of an expression product thereof, and (b) measuring the amount of bound agent and determining therefrom if the expression of said nucleic acid molecule or of an expression product thereof is aberrant, aberrant expression being diagnostic of the condition, wherein the nucleic acid molecule is at least one nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66. In certain embodiments, the nucleic acid molecule may be at least two, at least three, at least four, or even at least five nucleic acid molecules, each selected from the group consisting of SEQ ID NO:1-11, and 13-66. In some embodiments, the condition is a cartilaginous tissue degeneration condition that includes all forms of arthritis such as osteoarthritis, rheumatoid arthritis, osteochondrosis, and the like. In important embodiments, the condition is osteoarthritis.

[0018] According to still another aspect of the invention, a method for determining regression, progression or onset of a cartilaginous tissue degeneration condition in a subject characterized by aberrant expression of a nucleic acid molecule or an expression product thereof, is provided. The method involves monitoring a sample from a patient, for a parameter selected from the group consisting of (i) a nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, (ii) a polypeptide encoded by the nucleic acid, (iii) a peptide derived from the polypeptide, and (iv) an antibody which selectively binds the polypeptide or peptide, as a determination of regression, progression or onset of said cartilaginous tissue degeneration condition in the subject. In some embodiments, the sample is a biological fluid or a tissue as described in any of the foregoing embodiments. In certain embodiments, the step of monitoring comprises contacting the sample with a detectable agent selected from the group consisting of (a) an isolated nucleic acid molecule which selectively hybridizes under stringent conditions to the nucleic acid molecule of (i), (b) an antibody which selectively binds the polypeptide of (ii), or the peptide of (iii), and (c) a polypeptide or peptide which binds the antibody of (iv). The antibody, polypeptide, peptide, or nucleic acid can be labeled with a radioactive label or an enzyme. In further embodiments, the method further comprises assaying the sample for the peptide. In still further embodiments, monitoring the sample occurs over a period of time.

[0019] According to another aspect of the invention, a kit is provided. The kit comprises a package containing an agent that selectively binds to any of the foregoing novel isolated nucleic acids, or expression products thereof, and a control for comparing to a measured value of binding of said agent to said novel isolated nucleic acids, or expression products thereof. In some embodiments, the control is a predetermined value for comparing to the measured value. In certain embodiments, the control comprises an epitope of the expression product of any of the foregoing novel isolated nucleic acids. In one embodiment, the kit further comprises a second agent that selectively binds any of the foregoing novel isolated nucleic acids, or expression products thereof, and a control for comparing to a measured value of binding of said second agent to any of the foregoing novel isolated nucleic acids, or expression products thereof.

[0020] According to a further aspect of the invention, a method for treating a cartilaginous tissue degeneration condition in a subject is provided. The method involves administering to a subject in need of such treatment an agent that modulates expression of a molecule selected from the group consisting of SEQ ID NO:1-67, in an amount effective to treat the cartilaginous tissue degeneration condition. In certain embodiments, the method further comprises co-administering an agent known to inhibit cartilaginous/bone tissue degeneration, such as an osteogenic protein (including Bone Morphogenetic Proteins, or BMPs), Insulin-like Growth Factor (IGF), Transforming Growth Factor-β (TGF-β), and proteoglycans.

[0021] According to one aspect of the invention, a method for treating a subject to reduce the risk of a cartilaginous tissue degeneration condition developing in the subject is provided. The method involves administering to a subject who is known to express decreased levels of a molecule selected from the group consisting of SEQ ID NO:1-67, an agent for reducing the risk of cartilaginous tissue degeneration condition in an amount effective to lower the risk of the subject developing a future cartilaginous tissue degeneration condition, wherein the agent is known to inhibit cartilaginous/bone tissue degeneration, such as an osteogenic protein (including Bone Morphogenetic Proteins, or BMPs), Insulin-like Growth Factor (IGF), Transforming Growth Factor-β (TGF-β), and proteoglycans, or an agent that modulates expression of a molecule selected from the group consisting of SEQ ID NO:1-67. According to one aspect of the invention, a method for identifying a candidate agent useful in the treatment of a cartilaginous tissue degeneration condition, is provided. The method involves determining expression of a set of nucleic acid molecules in a cell of mesenchymal origin, cartilaginous tissue, skin and/or bone marrow tissue, under conditions which, in the absence of a candidate agent, permit a first amount of expression of the set of nucleic acid molecules, wherein the set of nucleic acid molecules comprises at least one nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, contacting the cell of mesenchymal origin, cartilaginous tissue, skin and/or bone marrow tissue with the candidate agent, and detecting a test amount of expression of the set of nucleic acid molecules, wherein an increase in the test amount of expression in the presence of the candidate agent relative to the first amount of expression indicates that the candidate agent is useful in the treatment of the cartilaginous tissue degeneration condition. In certain embodiments, the cartilaginous tissue degeneration condition includes all forms of arthritis such as osteoarthritis, rheumatoid arthritis, osteochondrosis, and the like. In important embodiments, the condition is osteoarthritis. In some embodiments, the set of nucleic acid molecules comprises at least two nucleic acid molecules, each selected from the group consisting of SEQ ID NO:1-11, and 13-66.

[0022] According to another aspect of the invention, a pharmaceutical composition is provided. The composition includes an agent comprising an isolated nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, or an expression product thereof, in a pharmaceutically effective amount to treat a cartilaginous tissue degeneration condition, and a pharmaceutically acceptable carrier. In some embodiments, the agent is an expression product of the isolated nucleic acid molecule selected from the group of SEQ ID NO:1-11, and 13-66. In certain embodiments, the cartilaginous tissue degeneration condition includes all forms of arthritis such as osteoarthritis, rheumatoid arthritis, osteochondrosis, and the like.

[0023] According to a further aspect of the invention, methods for preparing medicaments useful in the treatment of a cartilaginous tissue degeneration condition are provided.

[0024] According to still another aspect of the invention, a solid-phase nucleic acid molecule array, is provided. The array consists essentially of a set of nucleic acid molecules, expression products thereof, or fragments thereof, each nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, fixed to a solid substrate. In some embodiments, the solid-phase array further comprises at least one control nucleic acid molecule. In certain embodiments, the set of nucleic acid molecules comprises at least one, at least two, at least three, at least four, or even at least five nucleic acid molecules, each selected from the group consisting of SEQ ID NO:1-11, and 13-66.

[0025] According to still another aspect of the invention, a device is provided. The device comprises a material surface coated with an amount of an agent of the invention (i.e., an agent having mesenchymal cell differentiation induction activity). The amount of the agent is effective to induce mesenchymal cell differentiation in the cells of mesenchymal origin present in the tissue to which the implantable device is to be implanted. In certain embodiments, the material surface is part of an implant. The material comprising the implant may be synthetic material or organic tissue material. Important agents, cell-types, and so on, are as described elsewhere herein.

[0026] According to a further aspect of the invention, methods for preparing medicaments useful in the treatment of a cartilaginous tissue degeneration condition, are provided.

[0027] These and other objects of the invention will be described in further detail in connection with the detailed description of the invention.

Brief Description of the Sequences

[0028] SEQ ID NO:1 is the partial nucleotide sequence of the human DF-1 cDNA (RDA2).

[0029] SEQ ID NO:2 is the partial nucleotide sequence of the human DF-2 cDNA (RDA10).

[0030] SEQ ID NO:3 is the partial nucleotide sequence of the human DF-3 cDNA (RDA11).

[0031] SEQ ID NO:4 is the partial nucleotide sequence of the human DF-4 cDNA (RDA30).

[0032] SEQ ID NO:5 is the partial nucleotide sequence of the human DF-5 cDNA (RDA31).

[0033] SEQ ID NO:6 is the partial nucleotide sequence of the human DF-6 cDNA (RDA35A).

[0034] SEQ ID NO:7 is the partial nucleotide sequence of the human DF-7 cDNA (RDA38).

[0035] SEQ ID NO:8 is the partial nucleotide sequence of the human DF-8 cDNA (RDA52).

[0036] SEQ ID NO:9 is the partial nucleotide sequence of the human DF-9 cDNA (RDA86B).

[0037] SEQ ID NO:10 is the partial nucleotide sequence of the human DF-10 cDNA (RDA90D).

[0038] SEQ ID NO:11 is the partial nucleotide sequence of the human DF-11 cDNA (RDA 15).

[0039] SEQ ID NO:12 is the predicted amino acid sequence of the translation product of human DF-11 cDNA (SEQ ID NO:11).

[0040] SEQ ID NOs:13-66 are the nucleotide sequences of known genes induced in mesenchymal cells according to the present invention.

[0041] SEQ ID NO:67 is the amino acid sequence of AminoPhospholipid-transporting ATPase (ATP10C), its expression induced in mesenchymal cells according to the present invention.

[0042] SEQ ID NOs:68-79 are various oligonucleotide sequences used in the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0043] FIG. 1 depicts a kit embodying features of the present invention.

[0044] FIG. 2 shows a schematic of an experimental design for representational difference analysis.

[0045] FIG. 3 shows bar graphs depicting gene expression levels of genes known to be expressed in cartilage [type XI collagen (COL11A1), α-11 integrin, and FGF2], as well as of aggrecan (an abundant cartilage extracellular matrix gene), normalized to G3PDH.

DETAILED DESCRIPTION OF THE INVENTION

[0046] The invention involves the discovery of a number of genes that are upregulated in mesenchymal cells when the mesenchymal cells are cultured in a system that simulates physiological skeletal (bone and/or cartilaginous) growth conditions. It has been discovered that such upregulation leads, unexpectedly, to the acquirement of a chondroblastic phenotype by the mesenchymal cells (i.e., to cartilage/bone formation). In view of these discoveries, it is believed that the molecules of the present invention can be used to promote cartilage/bone formation, and in particular, to treat conditions that affect the skeleton, such as cartilaginous tissue degeneration conditions that include all forms of arthritis such as osteoarthritis, rheumatoid arthritis, osteochondrosis, and the like. Additionally, methods for using these molecules in the diagnosis of any of the foregoing skeletal degeneration conditions, are also provided.

[0047] Furthermore, methods for using these molecules in vivo or in vitro for the purpose of modulating mesenchymal cell differentiation, methods for treating conditions associated with skeletal degeneration, and compositions useful in the preparation of therapeutic preparations for the treatment of the foregoing conditions, are also provided.

[0048] “Upregulated,” as used herein, refers to increased expression of a gene and/or its encoded polypeptide. Increased expression refers to increasing (i.e., to a detectable extent) replication, transcription, and/or translation of any of the nucleic acids of the invention (SEQ ID NO:1-11, or 13-66), since upregulation of any of these processes results in concentration/amount increase of the polypeptide encoded by the gene (nucleic acid). Conversely, downregulation or decreased expression refers to decreased expression of a gene and/or its encoded polypeptide. The upregulation or downregulation of gene expression can be directly determined by detecting an increase or decrease, respectively, in the level of mRNA for the gene, or the level of protein expression of the gene-encoded polypeptide, using any suitable means known to the art, such as nucleic acid hybridization or antibody detection methods, respectively, and in comparison to controls. Upregulation or downregulation of gene expression can also be determined indirectly by detecting a change in mesenchymal cell differentiation induction activity of the gene.
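
The comparison to controls described in paragraph [0048] can be illustrated with a minimal Python sketch. It normalizes a target-gene signal to a reference (housekeeping) gene signal and classifies the fold change between a test sample and a control. The function names, the two-fold threshold, and the idea of a single cutoff are assumptions made for illustration only; the specification does not prescribe any particular threshold or calculation.

```python
def normalized_level(target_signal: float, reference_signal: float) -> float:
    """Express a target-gene signal relative to a reference-gene signal."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return target_signal / reference_signal


def classify_regulation(test_level: float, control_level: float,
                        threshold: float = 2.0) -> str:
    """Classify expression in a test sample relative to a control.

    The threshold is a hypothetical cutoff chosen for this sketch.
    """
    if control_level <= 0 or test_level < 0 or threshold <= 1:
        raise ValueError("levels must be non-negative, control positive, threshold > 1")
    fold_change = test_level / control_level
    if fold_change >= threshold:
        return "upregulated"
    if fold_change <= 1.0 / threshold:
        return "downregulated"
    return "unchanged"
```

In practice, whether a change counts as detectable depends on the assay and its replicate variability, so a fixed cutoff such as the one above is only a placeholder for a proper statistical comparison.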

[0049] The culture system used herein that simulates physiological skeletal (bone and/or cartilaginous) growth conditions, is a system that we previously developed, and is described in detail in U.S. Pat. No. 5,656,492, to Glowacki et al., entitled “Cell Induction Device.” For the specific conditions used in the identification of the various genes of the present invention, see the Examples section.

[0050] “Mesenchymal cell differentiation induction activity” refers to the ability of a molecule to induce differentiation of a mesenchymal cell to a chondroblast. Such activity can be determined using, for example, standard tests known in the art (e.g., expression of type II collagen and/or aggrecan molecules by cells of the chondroblastic phenotype; see also the Examples section).

[0051] A “molecule,” as used herein, embraces both “nucleic acids” and “polypeptides.” The molecules of the present invention (e.g., SEQ ID NOs:1-67) are capable of inducing mesenchymal cell differentiation both in vivo and in vitro.

[0052] “Expression,” as used herein, refers to nucleic acid and/or polypeptide expression, as well as to activity of the polypeptide molecule (e.g., mesenchymal cell differentiation induction activity of the molecule).

[0053] A “cell of mesenchymal origin” as used herein refers to a cell that has been generated as a result of the differentiation of a pluripotential cell(s) of the mesenchyme (tissue giving rise to all connective tissues, including cartilage). Such pluripotential cell of the mesenchyme includes pluripotent stem cells and committed progenitor cells.

[0054] As used herein, a subject is a human or a non-human mammal. In all embodiments human nucleic acid and polypeptide molecules, and human subjects are preferred.

[0055] One aspect of the invention involves the cloning of cDNAs encoding polypeptides with mesenchymal cell differentiation induction activity.

[0056] The invention involves in another aspect isolated polypeptides, the cDNAs encoding these polypeptides, functional modifications and variants of the foregoing, useful fragments of the foregoing, as well as diagnostics and therapeutics relating thereto.

[0057] As used herein with respect to nucleic acids, the term “isolated” means: (i) amplified in vitro by, for example, polymerase chain reaction (PCR); (ii) recombinantly produced by cloning; (iii) purified, as by cleavage and gel separation; or (iv) synthesized by, for example, chemical synthesis. An isolated nucleic acid is one which is readily manipulated by recombinant DNA techniques well known in the art. Thus, a nucleotide sequence contained in a vector in which 5′ and 3′ restriction sites are known or for which polymerase chain reaction (PCR) primer sequences have been disclosed is considered isolated but a nucleic acid sequence existing in its native state in its natural host is not. An isolated nucleic acid may be substantially purified, but need not be. For example, a nucleic acid that is isolated within a cloning or expression vector is not pure in that it may comprise only a tiny percentage of the material in the cell in which it resides. Such a nucleic acid is isolated, however, as the term is used herein because it is readily manipulated by standard techniques known to those of ordinary skill in the art.

[0058] As used herein with respect to polypeptides, the term “isolated” means separated from its native environment in sufficiently pure form so that it can be manipulated or used for any one of the purposes of the invention. Thus, isolated means sufficiently pure to be used (i) to raise and/or isolate antibodies, (ii) as a reagent in an assay, (iii) for sequencing, (iv) as a therapeutic, etc.

[0059] According to the invention, isolated nucleic acid molecules that code for polypeptides according to the present invention having mesenchymal cell differentiation induction activity include: (a) nucleic acid molecules which hybridize under stringent conditions to any nucleic acid molecule of SEQ ID NO:1-11 and which code for a polypeptide having mesenchymal cell differentiation induction activity, (b) nucleic acid molecules that differ from the nucleic acid molecules of (a) in codon sequence due to the degeneracy of the genetic code, and (c) complements of (a) or (b). “Complements,” as used herein, includes full-length complements or 100% complements of (a) or (b).

[0060] Homologs and alleles of the novel nucleic acids of the invention (SEQ ID NOs:1-11) can be identified by conventional techniques. Thus, an aspect of the invention is those nucleic acid sequences which code for polypeptides having mesenchymal cell differentiation induction activity and which hybridize to a nucleic acid molecule consisting of the coding region of SEQ ID NOs:1-11, under stringent conditions. The term “stringent conditions,” as used herein, refers to parameters with which the art is familiar. With nucleic acids, hybridization conditions are said to be stringent typically under conditions of low ionic strength and a temperature just below the melting temperature (Tₘ) of the DNA hybrid complex (typically, about 3° C. below the Tₘ of the hybrid). Higher stringency makes for a more specific correlation between the probe sequence and the target. Stringent conditions used in the hybridization of nucleic acids are well known in the art and may be found in references which compile such methods, e.g. Molecular Cloning: A Laboratory Manual, J. Sambrook, et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, or Current Protocols in Molecular Biology, F. M. Ausubel, et al., eds., John Wiley & Sons, Inc., New York. An example of “stringent conditions” is hybridization at 65° C. in 6×SSC. Another example of stringent conditions is hybridization at 65° C. in hybridization buffer that consists of 3.5×SSC, 0.02% Ficoll, 0.02% polyvinylpyrrolidone, 0.02% Bovine Serum Albumin, 2.5 mM NaH₂PO₄ [pH 7], 0.5% SDS, 2 mM EDTA. (SSC is 0.15M sodium chloride/0.015M sodium citrate, pH 7; SDS is sodium dodecyl sulphate; and EDTA is ethylenediaminetetraacetic acid.) After hybridization, the membrane upon which the DNA is transferred is washed in 2×SSC at room temperature and then in 0.1×SSC/0.1×SDS at temperatures up to 68° C. In a further example, an alternative to the use of an aqueous hybridization solution is the use of a formamide hybridization solution. Stringent hybridization conditions can thus be achieved using, for example, a 50% formamide solution and 42° C. There are other conditions, reagents, and so forth which can be used, and would result in a similar degree of stringency. The skilled artisan will be familiar with such conditions, and thus they are not given here. It will be understood, however, that the skilled artisan will be able to manipulate the conditions in a manner to permit the clear identification of homologs and alleles of the novel nucleic acids of the invention. The skilled artisan also is familiar with the methodology for screening cells and libraries for expression of such molecules which then are routinely isolated, followed by isolation of the pertinent nucleic acid molecule and sequencing.
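
To make the relationship between salt, formamide, and hybridization temperature concrete, the following Python sketch estimates a melting temperature (Tm) and then subtracts the roughly 3° C. offset mentioned in paragraph [0060]. The formula is one commonly quoted empirical approximation for long DNA:DNA hybrids (often attributed to Meinkoth and Wahl); it is not given in the specification, its coefficients vary between references, and the function and parameter names are hypothetical.

```python
import math


def estimate_tm_celsius(na_molar: float, gc_percent: float,
                        length_bp: int, formamide_percent: float = 0.0) -> float:
    """Rough empirical Tm estimate for a long DNA:DNA hybrid.

    Assumed approximation (not from the specification):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("sodium concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def stringent_hybridization_temp(tm_celsius: float, offset: float = 3.0) -> float:
    """Temperature a fixed offset below Tm (about 3 degrees C in the text)."""
    return tm_celsius - offset
```

The sketch shows why formamide permits a lower working temperature: each percent of formamide lowers the estimated Tm, so a 50% formamide solution can reach comparable stringency at a much lower temperature than an aqueous buffer. Actual conditions should be taken from the cited laboratory manuals rather than from this approximation.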

[0061] In general, homologs and alleles typically will share at least 40% nucleotide identity and/or at least 50% amino acid identity to any of SEQ ID NOs:1-11 and their encoded polypeptides, respectively, in some instances will share at least 50% nucleotide identity and/or at least 65% amino acid identity and in still other instances will share at least 60% nucleotide identity and/or at least 75% amino acid identity. In further instances, homologs and alleles typically will share at least 90%, 95%, or even 99% nucleotide identity and/or at least 95%, 98%, or even 99% amino acid identity to any of SEQ ID NOs:1-11 and their encoded polypeptides, respectively. The homology can be calculated using various publicly available software tools developed by NCBI (Bethesda, Md.). Exemplary tools include the heuristic algorithm of Altschul S F, et al., (J Mol Biol, 1990, 215:403-410), also known as BLAST. Pairwise and ClustalW alignments (BLOSUM30 matrix setting) as well as Kyte-Doolittle hydropathic analysis can be obtained using public (EMBL, Heidelberg, Germany) and commercial (e.g., the MacVector sequence analysis software from Oxford Molecular Group/Genetics Computer Group, Madison, Wis.) software. Watson-Crick complements of the foregoing nucleic acids also are embraced by the invention.
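
The percent identity figures in paragraph [0061] can be illustrated with a short Python sketch that scores two sequences that have already been aligned. This is not the BLAST or ClustalW calculation; the function name and the choice to ignore gapped columns in the denominator are assumptions made for this example, and alignment tools differ in how they define the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

A candidate homolog could then be checked against one of the stated thresholds, for example by testing whether `percent_identity(a, b) >= 40.0` for nucleotide sequences, bearing in mind that the result depends on the alignment that was supplied.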

[0062] In screening for genes related to any of SEQ ID NOs:1-11, such as their homologs and alleles, a Southern blot may be performed using the foregoing conditions, together with a radioactive probe. After washing the membrane to which the DNA is finally transferred, the membrane can be placed against X-ray film or a phosphoimager plate to detect the radioactive signal.

[0063] Given the teachings herein, full-length human cDNAs and other mammalian sequences, such as the mouse cDNA clone corresponding to the human DF gene, can be isolated from a cDNA library using standard colony hybridization techniques.

[0064] The invention also includes degenerate nucleic acids which include alternative codons to those present in the native materials. For example, serine residues are encoded by the codons TCA, AGT, TCC, TCG, TCT and AGC. Thus, it will be apparent to one of ordinary skill in the art that any of the serine-encoding nucleotide triplets may be employed to direct the protein synthesis apparatus, in vitro or in vivo, to incorporate a serine residue into an elongating polypeptide. Similarly, nucleotide sequence triplets which encode other amino acid residues include, but are not limited to: CCA, CCC, CCG and CCT (proline codons); CGA, CGC, CGG, CGT, AGA and AGG (arginine codons); ACA, ACC, ACG and ACT (threonine codons); AAC and AAT (asparagine codons); and ATA, ATC and ATT (isoleucine codons). Other amino acid residues may be encoded similarly by multiple nucleotide sequences. Thus, the invention embraces degenerate nucleic acids that differ from the biologically isolated nucleic acids in codon sequence due to the degeneracy of the genetic code.
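The degeneracy described above can be illustrated with a short Python sketch that enumerates every DNA sequence encoding a given short peptide. The table below contains only the codons recited in the preceding paragraph; the one-letter residue keys and the function name are assumptions introduced for this example, and residues outside the table are deliberately not supported.

```python
from itertools import product

# Codons as recited in the text (DNA alphabet), keyed by one-letter residue code.
CODONS = {
    "S": ["TCA", "AGT", "TCC", "TCG", "TCT", "AGC"],  # serine
    "P": ["CCA", "CCC", "CCG", "CCT"],                # proline
    "R": ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"],  # arginine
    "T": ["ACA", "ACC", "ACG", "ACT"],                # threonine
    "N": ["AAC", "AAT"],                              # asparagine
    "I": ["ATA", "ATC", "ATT"],                       # isoleucine
}


def degenerate_encodings(peptide: str) -> list[str]:
    """Return every DNA sequence that encodes `peptide` using the table above.

    Raises KeyError for residues that are not in the illustrative table.
    """
    choices = [CODONS[residue] for residue in peptide.upper()]
    return ["".join(codons) for codons in product(*choices)]
```

The number of encodings grows as the product of the codon counts, so a serine-asparagine dipeptide has 6 x 2 = 12 distinct coding sequences, and exhaustive enumeration is practical only for short peptides.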

[0065] The invention also provides isolated unique fragments of any of SEQ ID NOs:1-11 or complements thereof. A unique fragment is one that is a ‘signature’ for the larger nucleic acid. For example, the unique fragment is long enough to assure that its precise sequence is not found in molecules within the human genome outside of the nucleic acids defined above (SEQ ID NOs:1-11) (and human alleles). Those of ordinary skill in the art may apply no more than routine procedures to determine if a fragment is unique within the human genome. Unique fragments, however, exclude previously published sequences as of the filing date of this application.

[0066] A fragment which is completely composed of a published sequence described in the art as of the filing date of this application is one which does not include any of the nucleotides unique to the sequences of the invention. Thus, a unique fragment according to the invention must contain a nucleotide sequence other than the exact sequence of those in the prior art or fragments thereof. The difference may be an addition, deletion or substitution with respect to the known sequence or it may be a sequence wholly separate from the known sequence.

[0067] Unique fragments can be used as probes in Southern and Northern blot assays to identify such nucleic acids, or can be used in amplification assays such as those employing PCR. As known to those skilled in the art, large probes such as 200, 250, 300 or more nucleotides are preferred for certain uses such as Southern and Northern blots, while smaller fragments will be preferred for uses such as PCR. Unique fragments also can be used to produce fusion proteins for generating antibodies or determining binding of the polypeptide fragments, or for generating immunoassay components. Likewise, unique fragments can be employed to produce nonfused fragments of the novel polypeptides of the invention, useful, for example, in the preparation of antibodies, immunoassays or therapeutic applications.

[0068] Unique fragments further can be used as antisense molecules to inhibit the expression of any of the novel nucleic acids of the invention and their encoded polypeptides, respectively. As will be recognized by those skilled in the art, the size of the unique fragment will depend upon its conservancy in the genetic code. Thus, some regions of any of SEQ ID NOs:1-11 and complements will require longer segments to be unique while others will require only short segments, typically between 12 and 32 nucleotides long (e.g. 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 and 32 bases) or more, up to the entire length of the disclosed sequence. As mentioned above, this disclosure intends to embrace each and every fragment of each sequence, beginning at the first nucleotide, the second nucleotide and so on, up to 8 nucleotides short of the end, and ending anywhere from nucleotide number 8, 9, 10 and so on for each sequence, up to the very last nucleotide (provided the sequence is unique as described above). Virtually any segment of any of the nucleic acids of SEQ ID NOs:1-11, or complements thereof, that is 20 or more nucleotides in length will be unique. Those skilled in the art are well versed in methods for selecting such sequences, typically on the basis of the ability of the unique fragment to selectively distinguish the sequence of interest from other sequences in the human genome. A comparison of the sequence of the fragment to those on known databases typically is all that is necessary, although in vitro confirmatory hybridization and sequencing analysis may be performed.
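The database comparison mentioned above can be sketched, purely for illustration, as a search for fixed-length fragments of a sequence of interest that do not occur in a set of background sequences. In the Python sketch below, the function names are hypothetical, the background is a small in-memory list standing in for a sequence database, and only exact matches on the given strand are considered; a practical screen would also examine complements and near matches.

```python
def fragments(seq: str, k: int) -> list[str]:
    """All contiguous fragments of length k, in order of start position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def unique_fragments(target: str, background: list[str], k: int) -> list[str]:
    """Fragments of `target` of length k with no exact match in `background`.

    `background` stands in for a database of known sequences; matching is
    exact and same-strand only (assumptions of this sketch).
    """
    known = set()
    for other in background:
        known.update(fragments(other.upper(), k))
    return [f for f in fragments(target.upper(), k) if f not in known]
```

With realistic fragment lengths, such as the 12 to 32 nucleotides discussed above, the same approach applies unchanged; only the size of the background set grows.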

[0069] As mentioned above, the invention embraces antisense oligonucleotides that selectively bind to a nucleic acid molecule encoding a polypeptide having mesenchymal cell differentiation induction activity, to decrease such activity.

[0070] As used herein, the term “antisense oligonucleotide” or “antisense” describes an oligonucleotide that is an oligoribonucleotide, oligodeoxyribonucleotide, modified oligoribonucleotide, or modified oligodeoxyribonucleotide which hybridizes under physiological conditions to DNA comprising a particular gene or to an mRNA transcript of that gene and, thereby, inhibits the transcription of that gene and/or the translation of that mRNA. The antisense molecules are designed so as to interfere with transcription or translation of a target gene upon hybridization with the target gene or transcript. Those skilled in the art will recognize that the exact length of the antisense oligonucleotide and its degree of complementarity with its target will depend upon the specific target selected, including the sequence of the target and the particular bases which comprise that sequence. It is preferred that the antisense oligonucleotide be constructed and arranged so as to bind selectively with the target under physiological conditions, i.e., to hybridize substantially more to the target sequence than to any other sequence in the target cell under physiological conditions. Based upon any of SEQ ID NOs:1-11 or upon allelic or homologous genomic and/or cDNA sequences, one of skill in the art can easily choose and synthesize any of a number of appropriate antisense molecules for use in accordance with the present invention. In order to be sufficiently selective and potent for inhibition, such antisense oligonucleotides should comprise at least 10 and, more preferably, at least 15 consecutive bases which are complementary to the target, although in certain cases modified oligonucleotides as short as 7 bases in length have been used successfully as antisense oligonucleotides (Wagner et al., Nat. Med., 1995, 1(11):1116-1118; Nat. Biotech., 1996, 14:840-844). Most preferably, the antisense oligonucleotides comprise a complementary sequence of 20-30 bases. Although oligonucleotides may be chosen which are antisense to any region of the gene or mRNA transcripts, in preferred embodiments the antisense oligonucleotides correspond to N-terminal or 5′ upstream sites such as translation initiation, transcription initiation or promoter sites. In addition, 3′-untranslated regions may be targeted by antisense oligonucleotides. Targeting to mRNA splicing sites has also been used in the art but may be less preferred if alternative mRNA splicing occurs. In addition, the antisense is targeted, preferably, to sites in which mRNA secondary structure is not expected (see, e.g., Sainio et al., Cell Mol. Neurobiol. 14(5):439-457, 1994) and at which proteins are not expected to bind. Finally, although SEQ ID NO:1 discloses a cDNA sequence, one of ordinary skill in the art may easily derive the genomic DNA corresponding to this sequence. Thus, the present invention also provides for antisense oligonucleotides which are complementary to the genomic DNA corresponding to any of SEQ ID NOs:1-11. Similarly, antisense to allelic or homologous cDNAs and genomic DNAs are enabled without undue experimentation.
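As a simple illustration of the complementarity requirement discussed above, a candidate antisense oligodeoxyribonucleotide can be derived as the reverse complement of a chosen window of the target transcript. The Python sketch below makes several assumptions: the function names are hypothetical, the target is supplied 5′ to 3′ as DNA or RNA letters, the default window of 20 bases reflects the 20-30 base range mentioned above, and no account is taken of secondary structure, protein binding sites, or selectivity against other sequences.

```python
# Maps each target base (DNA or RNA alphabet) to its complementary DNA base.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(target: str) -> str:
    """Return the DNA oligonucleotide (5'->3') complementary to `target` (5'->3')."""
    target = target.upper()
    unexpected = set(target) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected characters in target: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_COMPLEMENT)[::-1]


def candidate_antisense(transcript: str, start: int, length: int = 20) -> str:
    """Antisense oligo against transcript[start:start + length] (0-based start)."""
    if length <= 0 or not 0 <= start <= len(transcript) - length:
        raise ValueError("window does not fit within the transcript")
    return antisense_dna(transcript[start:start + length])
```

For a window placed at a translation initiation site, `start` would be the 0-based position at which the window begins in the transcript; choosing among candidate windows remains a matter for the skilled artisan.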

[0071] In one set of embodiments, the antisense oligonucleotides of the invention may be composed of “natural” deoxyribonucleotides, ribonucleotides, or any combination thereof. That is, the 5′ end of one native nucleotide and the 3′ end of another native nucleotide may be covalently linked, as in natural systems, via a phosphodiester internucleoside linkage. These oligonucleotides may be prepared by art-recognized methods which may be carried out manually or by an automated synthesizer. They also may be produced recombinantly by vectors.

[0072] In preferred embodiments, however, the antisense oligonucleotides of the invention also may include “modified” oligonucleotides. That is, the oligonucleotides may be modified in a number of ways which do not prevent them from hybridizing to their target but which enhance their stability or targeting or which otherwise enhance their therapeutic effectiveness.

[0073] The term “modified oligonucleotide” as used herein describes an oligonucleotide in which (1) at least two of its nucleotides are covalently linked via a synthetic internucleoside linkage (i.e., a linkage other than a phosphodiester linkage between the 5′ end of one nucleotide and the 3′ end of another nucleotide) and/or (2) a chemical group not normally associated with nucleic acids has been covalently attached to the oligonucleotide. Preferred synthetic internucleoside linkages are phosphorothioates, alkylphosphonates, phosphorodithioates, phosphate esters, alkylphosphonothioates, phosphoramidates, carbamates, carbonates, phosphate triesters, acetamidates, carboxymethyl esters and peptides.

[0074] The term “modified oligonucleotide” also encompasses oligonucleotides with a covalently modified base and/or sugar. For example, modified oligonucleotides include oligonucleotides having backbone sugars which are covalently attached to low molecular weight organic groups other than a hydroxyl group at the 3′ position and other than a phosphate group at the 5′ position. Thus modified oligonucleotides may include a 2′-O-alkylated ribose group. In addition, modified oligonucleotides may include sugars such as arabinose instead of ribose. The present invention, thus, contemplates pharmaceutical preparations containing modified antisense molecules that are complementary to and hybridizable with, under physiological conditions, nucleic acids encoding polypeptides having mesenchymal cell differentiation activity, together with pharmaceutically acceptable carriers. Antisense oligonucleotides may be administered as part of a pharmaceutical composition. Such a pharmaceutical composition may include the antisense oligonucleotides in combination with any standard physiologically and/or pharmaceutically acceptable carriers which are known in the art. The compositions should be sterile and contain a therapeutically effective amount of the antisense oligonucleotides in a unit of weight or volume suitable for administration to a patient. The term “pharmaceutically acceptable” means a non-toxic material that does not interfere with the effectiveness of the biological activity of the active ingredients. The term “physiologically acceptable” refers to a non-toxic material that is compatible with a biological system such as a cell, cell culture, tissue, or organism. The characteristics of the carrier will depend on the route of administration. Physiologically and pharmaceutically acceptable carriers include diluents, fillers, salts, buffers, stabilizers, solubilizers, and other materials which are well known in the art.

[0075] The invention also involves expression vectors coding for proteins having mesenchymal cell differentiation activity and fragments and variants thereof and host cells containing those expression vectors. Virtually any cells, prokaryotic or eukaryotic, which can be transformed with heterologous DNA or RNA and which can be grown or maintained in culture, may be used in the practice of the invention. Examples include bacterial cells such as Escherichia coli and mammalian cells such as mouse, hamster, pig, goat, primate, etc. They may be of a wide variety of tissue types, including mast cells, fibroblasts, oocytes and lymphocytes, and they may be primary cells or cell lines. Specific examples include CHO cells and COS cells. Cell-free transcription systems also may be used in lieu of cells.

[0076] As used herein, a “vector” may be any of a number of nucleic acids into which a desired sequence may be inserted by restriction and ligation for transport between different genetic environments or for expression in a host cell. Vectors are typically composed of DNA although RNA vectors are also available. Vectors include, but are not limited to, plasmids, phagemids and virus genomes. A cloning vector is one which is able to replicate in a host cell, and which is further characterized by one or more endonuclease restriction sites at which the vector may be cut in a determinable fashion and into which a desired DNA sequence may be ligated such that the new recombinant vector retains its ability to replicate in the host cell. In the case of plasmids, replication of the desired sequence may occur many times as the plasmid increases in copy number within the host bacterium or just a single time per host before the host reproduces by mitosis. In the case of phage, replication may occur actively during a lytic phase or passively during a lysogenic phase. An expression vector is one into which a desired DNA sequence may be inserted by restriction and ligation such that it is operably joined to regulatory sequences and may be expressed as an RNA transcript. Vectors may further contain one or more marker sequences suitable for use in the identification of cells which have or have not been transformed or transfected with the vector. Markers include, for example, genes encoding proteins which increase or decrease either resistance or sensitivity to antibiotics or other compounds, genes which encode enzymes whose activities are detectable by standard assays known in the art (e.g., β-galactosidase or alkaline phosphatase), and genes which visibly affect the phenotype of transformed or transfected cells, hosts, colonies or plaques (e.g., green fluorescent protein). Preferred vectors are those capable of autonomous replication and expression of the structural gene products present in the DNA segments to which they are operably joined.

[0077] As used herein, a coding sequence and regulatory sequences are said to be “operably” joined when they are covalently linked in such a way as to place the expression or transcription of the coding sequence under the influence or control of the regulatory sequences. If it is desired that the coding sequences be translated into a functional protein, two DNA sequences are said to be operably joined if induction of a promoter in the 5′ regulatory sequences results in the transcription of the coding sequence and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region to direct the transcription of the coding sequences, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a promoter region would be operably joined to a coding sequence if the promoter region were capable of effecting transcription of that DNA sequence such that the resulting transcript might be translated into the desired protein or polypeptide.

[0078] The precise nature of the regulatory sequences needed for gene expression may vary between species or cell types, but shall in general include, as necessary, 5′ non-transcribed and 5′ non-translated sequences involved with the initiation of transcription and translation respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. Especially, such 5′ non-transcribed regulatory sequences will include a promoter region which includes a promoter sequence for transcriptional control of the operably joined gene. Regulatory sequences may also include enhancer sequences or upstream activator sequences as desired. The vectors of the invention may optionally include 5′ leader or signal sequences. The choice and design of an appropriate vector is within the ability and discretion of one of ordinary skill in the art.

[0079] Expression vectors containing all the necessary elements for expression are commercially available and known to those skilled in the art. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989. Cells are genetically engineered by the introduction into the cells of heterologous DNA (RNA) encoding any polypeptide of the invention or fragment or variant thereof. That heterologous DNA (RNA) is placed under operable control of transcriptional elements to permit the expression of the heterologous DNA in the host cell.

[0080] Preferred systems for mRNA expression in mammalian cells are those such as pRc/CMV (available from Invitrogen, Carlsbad, Calif.) that contain a selectable marker such as a gene that confers G418 resistance (which facilitates the selection of stably transfected cell lines) and the human cytomegalovirus (CMV) enhancer-promoter sequences. Additionally, suitable for expression in primate or canine cell lines is the pCEP4 vector (Invitrogen, Carlsbad, Calif.), which contains an Epstein Barr virus (EBV) origin of replication, facilitating the maintenance of plasmid as a multicopy extrachromosomal element. Another expression vector is the pEF-BOS plasmid containing the promoter of polypeptide Elongation Factor 1α, which efficiently stimulates transcription in vitro. The plasmid is described by Mizushima and Nagata (Nuc. Acids Res. 18:5322, 1990), and its use in transfection experiments is disclosed by, for example, Demoulin (Mol. Cell. Biol. 16:4710-4716, 1996). Still another preferred expression vector is an adenovirus, described by Stratford-Perricaudet, which is defective for E1 and E3 proteins (J. Clin. Invest. 90:626-630, 1992). The use of the adenovirus as an Adeno.P1A recombinant is disclosed by Warnier et al., in intradermal injection in mice for immunization against P1A (Int. J. Cancer, 67:303-310, 1996).

[0081] The invention also embraces so-called expression kits, which allow the artisan to prepare a desired expression vector or vectors. Such expression kits include at least separate portions of each of the previously discussed coding sequences. Other components may be added, as desired, as long as the previously mentioned sequences, which are required, are included.

[0082] It will also be recognized that the invention embraces the use of the above-described cDNA sequence-containing expression vectors to transfect host cells and cell lines, be these prokaryotic (e.g., Escherichia coli), or eukaryotic (e.g., CHO cells, COS cells, yeast expression systems and recombinant baculovirus expression in insect cells). Especially useful are mammalian cells such as mouse, hamster, pig, goat, primate, etc. They may be of a wide variety of tissue types, and include primary cells and cell lines. Specific examples include dendritic cells, U293 cells, peripheral blood leukocytes, bone marrow stem cells and embryonic stem cells. The invention also permits the construction of gene “knock-outs” in cells and in animals, providing materials for studying certain aspects of mesenchymal cell differentiation activity.

[0083] The invention also provides isolated polypeptides having mesenchymal cell differentiation activity (including whole proteins and partial proteins), encoded by the foregoing novel nucleic acids, and include the polypeptide of SEQ ID NO:12 and unique fragments thereof. Such polypeptides are useful, for example, alone or as part of fusion proteins to generate antibodies, as components of an immunoassay, etc. Polypeptides can be isolated from biological samples including tissue or cell homogenates, and can also be expressed recombinantly in a variety of prokaryotic and eukaryotic expression systems by constructing an expression vector appropriate to the expression system, introducing the expression vector into the expression system, and isolating the recombinantly expressed protein. Short polypeptides, including antigenic peptides (such as are presented by MHC molecules on the surface of a cell for immune recognition) also can be synthesized chemically using well-established methods of peptide synthesis.

[0084] A unique fragment of a polypeptide of the present invention, in general, has the features and characteristics of unique fragments as discussed above in connection with nucleic acids. As will be recognized by those skilled in the art, the size of the unique fragment will depend upon factors such as whether the fragment constitutes a portion of a conserved protein domain. Thus, some regions of any encoded polypeptide will require longer segments to be unique while others will require only short segments, typically between 5 and 12 amino acids long (e.g. 5, 6, 7, 8, 9, 10, 11 and 12 amino acids) or more, including each integer up to the full length.

[0085] Unique fragments of a polypeptide preferably are those fragments which retain a distinct functional capability of the polypeptide. Functional capabilities which can be retained in a unique fragment of a polypeptide include interaction with antibodies, interaction with other polypeptides or fragments thereof, interaction with other molecules, etc. One important activity is the ability to act as a signature for identifying the polypeptide. Those skilled in the art are well versed in methods for selecting unique amino acid sequences, typically on the basis of the ability of the unique fragment to selectively distinguish the sequence of interest from non-family members. A comparison of the sequence of the fragment to those on known databases typically is all that is necessary.

[0086] The invention embraces variants of the polypeptides of the invention described above. As used herein, a “variant” of a polypeptide of the invention is a polypeptide which contains one or more modifications to the primary amino acid sequence of a polypeptide of the invention. Modifications which create a polypeptide variant are typically made to the nucleic acid which encodes the polypeptide, and can include deletions, point mutations, truncations, amino acid substitutions and addition of amino acids or non-amino acid moieties to: 1) reduce or eliminate an activity of the polypeptide; 2) enhance a property of the polypeptide, such as protein stability in an expression system or the stability of protein-ligand binding; 3) provide a novel activity or property to the polypeptide, such as addition of an antigenic epitope or addition of a detectable moiety; or 4) provide equivalent or better binding to a polypeptide receptor or other molecule. Alternatively, modifications can be made directly to the polypeptide, such as by cleavage, addition of a linker molecule, addition of a detectable moiety, such as biotin, addition of a fatty acid, and the like. Modifications also embrace fusion proteins comprising all or part of the amino acid sequence. One of skill in the art will be familiar with methods for predicting the effect on protein conformation of a change in protein sequence, and can thus “design” a variant polypeptide according to known methods. One example of such a method is described by Dahiyat and Mayo in Science 278:82-87, 1997, whereby proteins can be designed de novo. The method can be applied to a known protein to vary only a portion of the polypeptide sequence. By applying the computational methods of Dahiyat and Mayo, specific variants of the polypeptides of the invention can be proposed and tested to determine whether the variant retains a desired conformation.

[0087] Variants can include polypeptides which are modified specifically to alter a feature of the polypeptide unrelated to its physiological activity. For example, cysteine residues can be substituted or deleted to prevent unwanted disulfide linkages. Similarly, certain amino acids can be changed to enhance expression of the polypeptide by eliminating proteolysis by proteases in an expression system (e.g., dibasic amino acid residues in yeast expression systems in which KEX2 protease activity is present).

[0088] Mutations of a nucleic acid which encodes a polypeptide of the invention preferably preserve the amino acid reading frame of the coding sequence, and preferably do not create regions in the nucleic acid which are likely to hybridize to form secondary structures, such as hairpins or loops, which can be deleterious to expression of the variant polypeptide.
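The reading-frame preference stated above can be illustrated with a deliberately simple check: an insertion or deletion within a coding sequence leaves the downstream frame intact only when the net change in length is a multiple of three nucleotides. In the Python sketch below the function name is hypothetical, and the check assumes a single contiguous insertion or deletion inside the coding region; it does not detect premature stop codons or secondary structure.

```python
def preserves_reading_frame(original_cds: str, mutated_cds: str) -> bool:
    """True if the net length change between the two coding sequences is a
    multiple of three, i.e. the downstream reading frame is not shifted.

    Assumes a single contiguous insertion or deletion (or substitutions only).
    """
    return (len(mutated_cds) - len(original_cds)) % 3 == 0
```

A three-nucleotide deletion therefore passes the check while a one-nucleotide insertion fails it; substitutions that leave the length unchanged always pass.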

[0089] Mutations can be made by selecting an amino acid substitution, or by random mutagenesis of a selected site in a nucleic acid which encodes the polypeptide. Variant polypeptides are then expressed and tested for one or more activities to determine which mutation provides a variant polypeptide with the desired properties. Further mutations can be made to variants (or to non-variant polypeptides) which are silent as to the amino acid sequence of the polypeptide, but which provide preferred codons for translation in a particular host. The preferred codons for translation of a nucleic acid in, e.g., Escherichia coli, are well known to those of ordinary skill in the art. Still other mutations can be made to the noncoding sequences of a gene or cDNA encoding the polypeptide to enhance expression of the polypeptide.

[0090] The skilled artisan will realize that conservative amino acid substitutions may be made in any of the polypeptides of the invention to provide functionally equivalent variants of the foregoing polypeptides, i.e., the variants retain the functional capabilities of the polypeptides of the invention. As used herein, a “conservative amino acid substitution” refers to an amino acid substitution which does not significantly alter the tertiary structure and/or activity of the polypeptide. Variants can be prepared according to methods for altering polypeptide sequence known to one of ordinary skill in the art, and include those that are found in references which compile such methods, e.g. Molecular Cloning: A Laboratory Manual, J. Sambrook, et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, or Current Protocols in Molecular Biology, F. M. Ausubel, et al., eds., John Wiley & Sons, Inc., New York. Exemplary functionally equivalent variants of the polypeptides of the invention include conservative amino acid substitutions (e.g. of SEQ ID NO:13). Conservative substitutions of amino acids include substitutions made amongst amino acids within the following groups: (a) M, I, L, V; (b) F, Y, W; (c) K, R, H; (d) A, G; (e) S, T; (f) Q, N; and (g) E, D.
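The substitution groups (a) through (g) recited above lend themselves to a simple lookup. The following Python sketch classifies a single-residue substitution as conservative when both residues fall within the same group; the function name is an assumption of this example, and membership in a group is only the definition given above, not a prediction that a particular variant retains function.

```python
# Groups (a)-(g) as recited in the text, using one-letter amino acid codes.
CONSERVATIVE_GROUPS = ["MILV", "FYW", "KRH", "AG", "ST", "QN", "ED"]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues belong to the same recited group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

For example, a lysine-to-arginine change falls within group (c) and is classified as conservative, whereas a lysine-to-glutamate change spans groups (c) and (g) and is not.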

[0091] Thus functionally equivalent variants of polypeptides of the invention, i.e., variants of the polypeptides which retain the function of each of the natural polypeptides, are contemplated by the invention. Conservative amino-acid substitutions in the amino acid sequence of polypeptides of the invention to produce functionally equivalent variants typically are made by alteration of a nucleic acid encoding each polypeptide (e.g., SEQ ID NOs:1-11). Such substitutions can be made by a variety of methods known to one of ordinary skill in the art. For example, amino acid substitutions may be made by PCR-directed mutation, site-directed mutagenesis according to the method of Kunkel (Kunkel, Proc. Nat. Acad. Sci. U.S.A. 82: 488-492, 1985), or by chemical synthesis of a gene encoding a polypeptide of the invention. The activity of functionally equivalent fragments of polypeptides of the invention can be tested by cloning the gene encoding the altered polypeptide of the invention into a bacterial or mammalian expression vector, introducing the vector into an appropriate host cell, expressing the altered polypeptide, and testing for a functional capability of the polypeptides as disclosed herein (e.g., mesenchymal cell differentiation induction activity, etc.).

[0092] The invention as described herein has a number of uses, some of which are described elsewhere herein. First, the invention permits isolation of polypeptides having mesenchymal cell differentiation induction activity. A variety of methodologies well-known to the skilled practitioner can be utilized to obtain isolated polypeptides. The polypeptide may be purified from cells which naturally produce the polypeptide by chromatographic means or immunological recognition. Alternatively, an expression vector may be introduced into cells to cause production of the polypeptide. In another method, mRNA transcripts may be microinjected or otherwise introduced into cells to cause production of the encoded polypeptide. Translation of an mRNA of the invention in cell-free extracts such as the reticulocyte lysate system also may be used to produce polypeptides. Those skilled in the art also can readily follow known methods for isolating polypeptides. These include, but are not limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography and immune-affinity chromatography.

[0093] The invention also provides, in certain embodiments, “dominant negative” polypeptides derived from polypeptides of the invention. A dominant negative polypeptide is an inactive variant of a protein, which, by interacting with the cellular machinery, displaces an active protein from its interaction with the cellular machinery or competes with the active protein, thereby reducing the effect of the active protein. For example, a dominant negative receptor which binds a ligand but does not transmit a signal in response to binding of the ligand can reduce the biological effect of expression of the ligand. Likewise, a dominant negative catalytically-inactive kinase which interacts normally with target proteins but does not phosphorylate the target proteins can reduce phosphorylation of the target proteins in response to a cellular signal. Similarly, a dominant negative transcription factor which binds to a promoter site in the control region of a gene but does not increase gene transcription can reduce the effect of a normal transcription factor by occupying promoter binding sites without increasing transcription.

[0094] The end result of the expression of a dominant negative polypeptide in a cell is a reduction in function of active proteins. One of ordinary skill in the art can assess the potential for a dominant negative variant of a protein, and use standard mutagenesis techniques to create one or more dominant negative variant polypeptides. See, e.g., U.S. Pat. No. 5,580,723 and Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989. The skilled artisan then can test the population of mutagenized polypeptides for diminution in a selected activity and/or for retention of such an activity. Other similar methods for creating and testing dominant negative variants of a protein will be apparent to one of ordinary skill in the art.

[0095] The isolation of the cDNAs of the invention (SEQ ID NOs:1-11) also makes it possible for the artisan to diagnose a disorder characterized by an aberrant expression of any gene encoded by such cDNAs. These methods involve determining expression of the gene, and/or polypeptides derived therefrom. In the former situation, such determinations can be carried out via any standard nucleic acid determination assay, including the polymerase chain reaction, or assaying with labeled hybridization probes as exemplified below. In the latter situation, such determination can be carried out via any standard immunological assay using, for example, antibodies which bind to the secreted protein.

[0096] The invention also embraces isolated peptide binding agents which, for example, can be antibodies or fragments of antibodies (“binding polypeptides”), having the ability to selectively bind to polypeptides of the present invention. Antibodies include polyclonal and monoclonal antibodies, prepared according to conventional methodology.

[0097] Significantly, as is well-known in the art, only a small portion of an antibody molecule, the paratope, is involved in the binding of the antibody to its epitope (see, in general, Clark, W. R. (1986) The Experimental Foundations of Modern Immunology, Wiley & Sons, Inc., New York; Roitt, I. (1991) Essential Immunology, 7th Ed., Blackwell Scientific Publications, Oxford). The pFc′ and Fc regions, for example, are effectors of the complement cascade but are not involved in antigen binding. An antibody from which the pFc′ region has been enzymatically cleaved, or which has been produced without the pFc′ region, designated an F(ab′)₂ fragment, retains both of the antigen binding sites of an intact antibody. Similarly, an antibody from which the Fc region has been enzymatically cleaved, or which has been produced without the Fc region, designated an Fab fragment, retains one of the antigen binding sites of an intact antibody molecule. Proceeding further, Fab fragments consist of a covalently bound antibody light chain and a portion of the antibody heavy chain denoted Fd. The Fd fragments are the major determinant of antibody specificity (a single Fd fragment may be associated with up to ten different light chains without altering antibody specificity) and Fd fragments retain epitope-binding ability in isolation.

[0098] Within the antigen-binding portion of an antibody, as is well-known in the art, there are complementarity determining regions (CDRs), which directly interact with the epitope of the antigen, and framework regions (FRs), which maintain the tertiary structure of the paratope (see, in general, Clark, 1986; Roitt, 1991). In both the heavy chain Fd fragment and the light chain of IgG immunoglobulins, there are four framework regions (FR1 through FR4) separated respectively by three complementarity determining regions (CDR1 through CDR3). The CDRs, and in particular the CDR3 regions, and more particularly the heavy chain CDR3, are largely responsible for antibody specificity.

[0099] It is now well-established in the art that the non-CDR regions of a mammalian antibody may be replaced with similar regions of conspecific or heterospecific antibodies while retaining the epitopic specificity of the original antibody. This is most clearly manifested in the development and use of “humanized” antibodies in which non-human CDRs are covalently joined to human FR and/or Fc/pFc′ regions to produce a functional antibody. See, e.g., U.S. Pat. Nos. 4,816,567, 5,225,539, 5,585,089, 5,693,762 and 5,859,205. Thus, for example, PCT International Publication Number WO 92/04381 teaches the production and use of humanized murine RSV antibodies in which at least a portion of the murine FR regions have been replaced by FR regions of human origin. Such antibodies, including fragments of intact antibodies with antigen-binding ability, are often referred to as “chimeric” antibodies.

[0100] Thus, as will be apparent to one of ordinary skill in the art, the present invention also provides for F(ab′)₂, Fab, Fv and Fd fragments; chimeric antibodies in which the Fc and/or FR and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human sequences; chimeric F(ab′)₂ fragment antibodies in which the FR and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human sequences; chimeric Fab fragment antibodies in which the FR and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human sequences; and chimeric Fd fragment antibodies in which the FR and/or CDR1 and/or CDR2 regions have been replaced by homologous human or non-human sequences. The present invention also includes so-called single chain antibodies.

[0101] Thus, the invention involves polypeptides of numerous sizes and types that bind specifically to polypeptides of the invention, and complexes of both polypeptides and their binding partners. These polypeptides may be derived also from sources other than antibody technology. For example, such polypeptide binding agents can be provided by degenerate peptide libraries which can be readily prepared in solution, in immobilized form, as bacterial flagella peptide display libraries or as phage display libraries. Combinatorial libraries also can be synthesized of peptides containing one or more amino acids. Libraries further can be synthesized of peptides and non-peptide synthetic moieties.

[0102] Phage display can be particularly effective in identifying binding peptides useful according to the invention. Briefly, one prepares a phage library (using, e.g., M13, fd, or lambda phage), displaying inserts from 4 to about 80 amino acid residues using conventional procedures. The inserts may represent, for example, a completely degenerate or biased array. One then can select phage bearing inserts which bind to the polypeptide or a complex of the polypeptide and a binding partner. This process can be repeated through several cycles of reselection of phage that bind to the polypeptide or complex. Repeated rounds lead to enrichment of phage bearing particular sequences. DNA sequence analysis can be conducted to identify the sequences of the expressed polypeptides. The minimal linear portion of the sequence that binds to the polypeptide or complex can be determined. One can repeat the procedure using a biased library containing inserts containing part or all of the minimal linear portion plus one or more additional degenerate residues upstream or downstream thereof. Yeast two-hybrid screening methods also may be used to identify polypeptides that bind to the polypeptides of the invention. Thus, the polypeptides of the invention, or a fragment thereof, or complexes of a polypeptide and a binding partner can be used to screen peptide libraries, including phage display libraries, to identify and select peptide binding partners of the polypeptides of the invention. Such molecules can be used, as described, for screening assays, for purification protocols, for interfering directly with the functioning of the polypeptide and for other purposes that will be apparent to those of ordinary skill in the art.
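By way of illustration only, the enrichment that accompanies repeated rounds of selection could be quantified from sequenced inserts roughly as follows. This is a minimal sketch, not a procedure taken from this specification: the function name `enrichment`, the four-letter insert sequences, and the read lists are hypothetical placeholders, and the fold change in relative frequency is only one of several measures that might be used.

```python
from collections import Counter

def enrichment(before, after):
    """Fold change in relative frequency of each insert between two rounds.

    `before` and `after` are lists of insert sequences read from the phage
    pool at an earlier and a later round of selection (hypothetical inputs).
    Inserts absent from the earlier round are skipped, because a fold change
    cannot be computed without a baseline frequency.
    """
    freq_before, freq_after = Counter(before), Counter(after)
    n_before, n_after = len(before), len(after)
    folds = {}
    for seq, count in freq_after.items():
        if freq_before[seq] == 0:
            continue
        folds[seq] = (count / n_after) / (freq_before[seq] / n_before)
    return folds

# Made-up reads used purely to exercise the function.
round1 = ["ACDE", "FGHI", "KLMN", "ACDE"]
round3 = ["ACDE", "ACDE", "ACDE", "FGHI"]
folds = enrichment(round1, round3)
```

Inserts whose fold change rises well above 1 across rounds would be the natural candidates for DNA sequence analysis and for mapping of the minimal linear binding portion.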

[0103] A polypeptide of the invention, or a fragment thereof, also can be used to isolate its native binding partners. Isolation of binding partners may be performed according to well-known methods. For example, isolated polypeptides can be attached to a substrate, and then a solution suspected of containing a binding partner of the polypeptide may be applied to the substrate. If the binding partner for a polypeptide of the invention is present in the solution, then it will bind to the substrate-bound polypeptide. The binding partner then may be isolated. Other proteins which are binding partners for a polypeptide of the invention may be isolated by similar methods without undue experimentation.

[0104] The invention also provides methods to measure the level of gene expression in a subject. This can be performed by first obtaining a test sample from the subject. The test sample can be tissue or biological fluid. Tissues include brain, heart, serum, breast, colon, bladder, uterus, prostate, stomach, testis, ovary, pancreas, pituitary gland, adrenal gland, thyroid gland, salivary gland, mammary gland, kidney, liver, intestine, spleen, thymus, blood vessels, bone marrow, trachea, and lung. In certain embodiments, test samples originate from heart and blood vessel tissues, and biological fluids include blood, saliva and urine. Both invasive and non-invasive techniques can be used to obtain such samples and are well documented in the art. At the molecular level both PCR and Northern blotting can be used to determine the level of SEQ ID NOs:1-11 mRNA using products of this invention described herein, and protocols well known in the art that are found in references which compile such methods. At the protein level, polypeptide expression can be determined using either polyclonal or monoclonal anti-polypeptide sera in combination with standard immunological assays. The preferred methods will compare the measured level of expression of the test sample to a control. A control can include a known amount of a nucleic acid probe, an epitope (such as an expression product of any of SEQ ID NOs:1-11), or a similar test sample of a subject with a control or ‘normal’ level of expression.

[0105] Polypeptides of the invention preferably are produced recombinantly, although such polypeptides may be isolated from biological extracts. Recombinantly produced polypeptides include chimeric proteins comprising a fusion of a protein with another polypeptide, e.g., a polypeptide capable of providing or enhancing protein-protein binding, sequence specific nucleic acid binding (such as GAL4), enhancing stability of the polypeptide of the invention under assay conditions, or providing a detectable moiety, such as green fluorescent protein. A polypeptide fused to a polypeptide of the invention or fragment may also provide means of readily detecting the fusion protein, e.g., by immunological recognition or by fluorescent labeling.

[0106] The invention also is useful in the generation of transgenic non-human animals. As used herein, “transgenic non-human animals” includes non-human animals having one or more exogenous nucleic acid molecules incorporated in germ line cells and/or somatic cells. Thus the transgenic animals include “knockout” animals having a homozygous or heterozygous gene disruption by homologous recombination, animals having episomal or chromosomally incorporated expression vectors, etc. Knockout animals can be prepared by homologous recombination using embryonic stem cells as is well known in the art. The recombination may be facilitated using, for example, the cre/lox system or other recombinase systems known to one of ordinary skill in the art. In certain embodiments, the recombinase system itself is expressed conditionally, for example, in certain tissues or cell types, at certain embryonic or post-embryonic developmental stages, is induced by the addition of a compound which increases or decreases expression, and the like. In general, the conditional expression vectors used in such systems use a variety of promoters which confer the desired gene expression pattern (e.g., temporal or spatial). Conditional promoters also can be operably linked to nucleic acid molecules of the invention to increase expression of the encoded gene and/or polypeptide in a regulated or conditional manner. Trans-acting negative regulators of each gene's activity or expression also can be operably linked to a conditional promoter as described above. Such trans-acting regulators include antisense nucleic acid molecules, nucleic acid molecules which encode dominant negative molecules, ribozyme molecules specific for each nucleic acid of the invention, and the like. The transgenic non-human animals are useful in experiments directed toward testing biochemical or physiological effects of diagnostics or therapeutics for conditions characterized by increased or decreased gene expression. Other uses will be apparent to one of ordinary skill in the art.

[0107] The invention also contemplates gene therapy. The procedure for performing ex vivo gene therapy is outlined in U.S. Pat. No. 5,399,346 and in exhibits submitted in the file history of that patent, all of which are publicly available documents. In general, it involves introduction in vitro of a functional copy of a gene into a cell(s) of a subject which contains a defective copy of the gene, and returning the genetically engineered cell(s) to the subject. The functional copy of the gene is under operable control of regulatory elements which permit expression of the gene in the genetically engineered cell(s). Numerous transfection and transduction techniques as well as appropriate expression vectors are well known to those of ordinary skill in the art, some of which are described in PCT application WO95/00654. In vivo gene therapy using vectors such as adenovirus, retroviruses, herpes virus, and targeted liposomes also is contemplated according to the invention.

[0108] The invention further provides efficient methods of identifying agents or lead compounds for agents active at the level of a cellular function dependent on a polypeptide of the invention, or on a fragment thereof. In particular, such functions include interaction with other polypeptides or fragments. Generally, the screening methods involve assaying for compounds which interfere with polypeptide activity (such as mesenchymal cell differentiation induction activity), although compounds which enhance mesenchymal cell differentiation induction activity of a polypeptide of the invention also can be assayed using the screening methods. Such methods are adaptable to automated, high throughput screening of compounds. Target indications include cellular processes modulated by a polypeptide of the invention such as mesenchymal cell differentiation induction activity.

[0109] A wide variety of assays for candidate (pharmacological) agents are provided, including labeled in vitro protein-ligand binding assays, electrophoretic mobility shift assays, immunoassays, cell-based assays such as two- or three-hybrid screens, expression assays, etc. The transfected nucleic acids can encode, for example, combinatorial peptide libraries or cDNA libraries. Convenient reagents for such assays, e.g., GAL4 fusion proteins, are known in the art. An exemplary cell-based assay involves transfecting a cell with a nucleic acid encoding a polypeptide of the invention fused to a GAL4 DNA binding domain and a nucleic acid encoding a reporter gene operably linked to a gene expression regulatory region, such as one or more GAL4 binding sites. Activation of reporter gene transcription occurs when a polypeptide of the invention and a reporter fusion polypeptide bind such as to enable transcription of the reporter gene. Agents which modulate a cell function mediated by a polypeptide of the invention are then detected through a change in the expression of the reporter gene. Methods for determining changes in the expression of a reporter gene are known in the art.

[0110] Polypeptide fragments used in the methods, when not produced by a transfected nucleic acid, are added to an assay mixture as an isolated polypeptide. Polypeptides of the invention preferably are produced recombinantly, although such polypeptides may be isolated from biological extracts. Recombinantly produced polypeptides include chimeric proteins comprising a fusion of a polypeptide of the invention with another polypeptide, e.g., a polypeptide capable of providing or enhancing protein-protein binding, sequence specific nucleic acid binding (such as GAL4), enhancing stability of the polypeptide of the invention under assay conditions, or providing a detectable moiety, such as green fluorescent protein or Flag epitope.

[0111] The assay mixture is comprised of a natural intracellular binding target of a polypeptide of the invention capable of interacting with a polypeptide of the invention. While natural binding targets of a polypeptide of the invention may be used, it is frequently preferred to use portions (e.g., peptides or nucleic acid fragments) or analogs (i.e., agents which mimic the binding properties of the natural binding target for purposes of the assay) of the binding target of a polypeptide of the invention so long as the portion or analog provides binding affinity and avidity to a fragment of the polypeptide of the invention measurable in the assay.

[0112] The assay mixture also comprises a candidate agent. Typically, a plurality of assay mixtures are run in parallel with different agent concentrations to obtain a different response to the various concentrations. Typically, one of these concentrations serves as a negative control, i.e., at zero concentration of agent or at a concentration of agent below the limits of assay detection. Candidate agents encompass numerous chemical classes, although typically they are organic compounds. Preferably, the candidate agents are small organic compounds, i.e., those having a molecular weight of more than 50 yet less than about 2500, preferably less than about 1000 and, more preferably, less than about 500. Candidate agents comprise functional chemical groups necessary for structural interactions with polypeptides and/or nucleic acids, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups and more preferably at least three of the functional chemical groups. The candidate agents can comprise cyclic carbon or heterocyclic structure and/or aromatic or polyaromatic structures substituted with one or more of the above-identified functional groups. Candidate agents also can be biomolecules such as peptides, saccharides, fatty acids, sterols, isoprenoids, purines, pyrimidines, derivatives or structural analogs of the above, or combinations thereof and the like. Where the agent is a nucleic acid, the agent typically is a DNA or RNA molecule, although modified nucleic acids as defined herein are also contemplated.
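For illustration, the molecular weight preferences stated above for small organic candidate agents could be applied to a compound library along the following lines. This is a sketch under stated assumptions: the function name `mw_tier` and the tier labels are hypothetical, and because the limits are recited as approximate (“about”), the hard cutoffs used here are a simplification.

```python
def mw_tier(molecular_weight):
    """Map a molecular weight to the preference tiers recited for small
    organic candidate agents (more than 50, less than about 2500,
    preferably less than about 1000, more preferably less than about 500).

    The cutoffs are treated as exact for simplicity, which is an assumption;
    the text gives them only as approximate values.
    """
    if molecular_weight <= 50 or molecular_weight >= 2500:
        return "outside range"
    if molecular_weight < 500:
        return "more preferred"
    if molecular_weight < 1000:
        return "preferred"
    return "acceptable"

# Hypothetical molecular weights used only to exercise the function.
tiers = [mw_tier(mw) for mw in (40, 300, 750, 2000, 3000)]
```

Such a filter addresses molecular weight only; the functional-group and structural criteria described above would need separate checks.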

[0113] Candidate agents are obtained from a wide variety of sources including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides, synthetic organic combinatorial libraries, phage display libraries of random peptides, and the like. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural and synthetically produced libraries and compounds can be modified through conventional chemical, physical, and biochemical means. Further, known (pharmacological) agents may be subjected to directed or random chemical modifications such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs of the agents.

[0114] A variety of other reagents also can be included in the mixture. These include reagents such as salts, buffers, neutral proteins (e.g., albumin), detergents, etc. which may be used to facilitate optimal protein-protein and/or protein-nucleic acid binding. Such a reagent may also reduce non-specific or background interactions of the reaction components. Other reagents that improve the efficiency of the assay such as protease inhibitors, nuclease inhibitors, antimicrobial agents, and the like may also be used.

[0115] The mixture of the foregoing assay materials is incubated under conditions whereby, but for the presence of the candidate agent, the polypeptide of the invention specifically binds a cellular binding target, a portion thereof or analog thereof. The order of addition of components, incubation temperature, time of incubation, and other parameters of the assay may be readily determined. Such experimentation merely involves optimization of the assay parameters, not the fundamental composition of the assay. Incubation temperatures typically are between 4° C. and 40° C. Incubation times preferably are minimized to facilitate rapid, high throughput screening, and typically are between 0.1 and 10 hours.

[0116] After incubation, the presence or absence of specific binding between the polypeptide of the invention and one or more binding targets is detected by any convenient method available to the user. For cell free binding type assays, a separation step is often used to separate bound from unbound components. The separation step may be accomplished in a variety of ways. Conveniently, at least one of the components is immobilized on a solid substrate, from which the unbound components may be easily separated. The solid substrate can be made of a wide variety of materials and in a wide variety of shapes, e.g., microtiter plate, microbead, dipstick, resin particle, etc. The substrate preferably is chosen to maximize signal to noise ratios, primarily to minimize background binding, as well as for ease of separation and cost.

[0117] Separation may be effected, for example, by removing a bead or dipstick from a reservoir, emptying or diluting a reservoir such as a microtiter plate well, or rinsing a bead, particle, chromatographic column or filter with a wash solution or solvent. The separation step preferably includes multiple rinses or washes. For example, when the solid substrate is a microtiter plate, the wells may be washed several times with a washing solution, which typically includes those components of the incubation mixture that do not participate in specific binding such as salts, buffer, detergent, non-specific protein, etc. Where the solid substrate is a magnetic bead, the beads may be washed one or more times with a washing solution and isolated using a magnet.

[0118] Detection may be effected in any convenient way for cell-based assays such as two- or three-hybrid screens. The transcript resulting from a reporter gene transcription assay of a polypeptide of the invention interacting with a target molecule typically encodes a directly or indirectly detectable product, e.g., β-galactosidase activity, luciferase activity, and the like. For cell free binding assays, one of the components usually comprises, or is coupled to, a detectable label. A wide variety of labels can be used, such as those that provide direct detection (e.g., radioactivity, luminescence, optical or electron density, etc.), or indirect detection (e.g., epitope tag such as the FLAG epitope, enzyme tag such as horseradish peroxidase, etc.). The label may be bound to a binding partner of a polypeptide of the invention, or incorporated into the structure of the binding partner.

[0119] A variety of methods may be used to detect the label, depending on the nature of the label and other assay components. For example, the label may be detected while bound to the solid substrate or subsequent to separation from the solid substrate. Labels may be directly detected through optical or electron density, radioactive emissions, nonradiative energy transfers, etc. or indirectly detected with antibody conjugates, streptavidin-biotin conjugates, etc. Methods for detecting the labels are well known in the art.

[0120] The invention provides specific binding agents to any of the polypeptides of the invention, methods of identifying and making such agents, and their use in diagnosis, therapy and pharmaceutical development. For example, pharmacological agents specific for any of the polypeptides of the invention are useful in a variety of diagnostic and therapeutic applications, especially where disease or disease prognosis is associated with altered polypeptide binding characteristics. Novel binding agents specific for any of the polypeptides of the invention include specific antibodies, cell surface receptors, and other natural intracellular and extracellular binding agents identified with assays such as two hybrid screens, and non-natural intracellular and extracellular binding agents identified in screens of chemical libraries and the like.

[0121] In general, the specificity of binding of any of the polypeptides of the invention to a specific molecule is determined by binding equilibrium constants. Targets which are capable of selectively binding any of the polypeptides of the invention preferably have binding equilibrium constants of at least about 10⁷ M⁻¹, more preferably at least about 10⁸ M⁻¹, and most preferably at least about 10⁹ M⁻¹. A wide variety of cell based and cell free assays may be used to demonstrate specific binding. Cell based assays include one, two and three hybrid screens, assays in which polypeptide mediated transcription is inhibited or increased, etc. Cell free assays include protein binding assays, immunoassays, etc. Other assays useful for screening agents which bind any of the polypeptides of the invention include fluorescence resonance energy transfer (FRET), and electrophoretic mobility shift analysis (EMSA).
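As an illustrative aid, the binding equilibrium constants above are expressed in M⁻¹ and therefore read as association constants; for simple one-to-one binding the association constant is the reciprocal of the dissociation constant, so a value of 10⁷ M⁻¹ corresponds to a dissociation constant of 10⁻⁷ M (100 nM). The sketch below sorts a measured constant into the stated preference levels. The function names `ka_from_kd` and `binding_tier` and the tier labels are hypothetical, and the “about” in each limit is treated as an exact cutoff for simplicity.

```python
def ka_from_kd(kd_molar):
    """Association constant (M^-1) from a dissociation constant (M),
    assuming simple one-to-one binding."""
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    return 1.0 / kd_molar

def binding_tier(ka):
    """Classify an association constant (M^-1) against the recited limits
    of about 1e7, 1e8 and 1e9 M^-1 (treated here as exact cutoffs)."""
    if ka >= 1e9:
        return "most preferred"
    if ka >= 1e8:
        return "more preferred"
    if ka >= 1e7:
        return "preferred"
    return "below threshold"

# Hypothetical measurement: a dissociation constant of 5 nM.
tier = binding_tier(ka_from_kd(5e-9))
```

A target with a dissociation constant of 5 nM would thus have an association constant of roughly 2 × 10⁸ M⁻¹.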

[0122] According to another aspect of the invention, a method for identifying an agent useful in modulating mesenchymal cell differentiation induction activity of a molecule of the invention, is provided. The method involves (a) contacting a molecule having mesenchymal cell differentiation induction activity with a candidate agent, (b) measuring mesenchymal cell differentiation induction activity of the molecule, and (c) comparing the measured mesenchymal cell differentiation induction activity of the molecule to a control to determine whether the candidate agent modulates mesenchymal cell differentiation induction activity of the molecule, wherein the molecule is any nucleic acid molecule of SEQ ID NO:1-11, and 13-66, or an expression product thereof. “Contacting” refers to both direct and indirect contacting of a molecule having mesenchymal cell differentiation induction activity with the candidate agent. “Indirect” contacting means that the candidate agent exerts its effects on the mesenchymal cell differentiation induction activity of the molecule via a third agent (e.g., a messenger molecule, a receptor, etc.). In certain embodiments, the control is mesenchymal cell differentiation induction activity of the molecule measured in the absence of the candidate agent. Assaying methods and candidate agents are as described above in the foregoing embodiments.
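The comparison in step (c) can be sketched as follows, purely for illustration. The function name `classify_modulation`, the activity units, and the 20% tolerance band are assumptions introduced here; the specification does not fix how large a difference from the control counts as modulation, and in practice that would depend on assay variability.

```python
def classify_modulation(measured, control, tolerance=0.2):
    """Compare activity measured with a candidate agent to a control
    measured in the absence of the agent.

    `tolerance` is a hypothetical band around the control within which
    the agent is treated as having no detectable effect.
    """
    if control <= 0:
        raise ValueError("control activity must be positive")
    ratio = measured / control
    if ratio > 1 + tolerance:
        return "enhances"
    if ratio < 1 - tolerance:
        return "inhibits"
    return "no detectable modulation"

# Made-up activity readings (arbitrary units) for three candidate agents.
results = [classify_modulation(m, 100.0) for m in (150.0, 40.0, 105.0)]
```

An agent falling in either the enhancing or the inhibiting class would count as modulating the activity of the molecule.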

[0123] According to still another aspect of the invention, a method of diagnosing a disorder characterized by aberrant expression of a nucleic acid molecule, an expression product thereof, or a fragment of an expression product thereof, is provided. The method involves contacting a biological sample isolated from a subject with an agent that specifically binds to the nucleic acid molecule, an expression product thereof, or a fragment of an expression product thereof, and determining the interaction between the agent and the nucleic acid molecule or the expression product as a determination of the disorder, wherein the nucleic acid molecule is any nucleic acid molecule of SEQ ID NO:1-11, and 13-66. In some embodiments, the disorder is a cartilaginous tissue degeneration condition selected from the group consisting of osteoarthritis, rheumatoid arthritis, and osteochondrosis. In one embodiment, the disorder is osteoarthritis.

[0124] In the case where the molecule is a nucleic acid molecule, such determinations can be carried out via any standard nucleic acid determination assay, including the polymerase chain reaction, or assaying with labeled hybridization probes as exemplified herein. In the case where the molecule is an expression product of the nucleic acid molecule, or a fragment of an expression product of the nucleic acid molecule, such determination can be carried out via any standard immunological assay using, for example, antibodies which bind to any of the polypeptide expression products.

[0125] “Aberrant expression” refers to decreased expression (underexpression) or increased expression (overexpression) of any of the foregoing molecules (SEQ ID NOs: 1-67, nucleic acids and/or polypeptides) in comparison with a control (i.e., expression of the same molecule in a healthy or “normal” subject). A “healthy subject”, as used herein, refers to a subject who is not at risk for developing a future skeletal degeneration condition. Healthy subjects also do not otherwise exhibit symptoms of disease. In other words, such subjects, if examined by a medical professional, would be characterized as healthy, free of symptoms of a skeletal degeneration condition, and not at risk of developing a skeletal degeneration condition.

[0126] When the disorder is a skeletal degeneration condition selected from the group consisting of osteoarthritis, rheumatoid arthritis, and osteochondrosis, decreased expression of any of the foregoing molecules in comparison with a control (e.g., a healthy individual) is indicative of the presence of the disorder, or indicative of the risk for developing such disorder in the future.

[0127] The invention also provides novel kits which could be used to measure the levels of the nucleic acids of the invention, or expression products of the invention.

[0128] In one embodiment, a kit comprises a package containing an agent that selectively binds to any of the foregoing novel, isolated nucleic acids (SEQ ID NOs: 1-11), or expression products thereof, and a control for comparing to a measured value of binding of said agent to any of the foregoing novel, isolated nucleic acids or expression products thereof. In some embodiments, the control is a predetermined value for comparing to the measured value. In certain embodiments, the control comprises an epitope of the expression product of any of the foregoing novel, isolated nucleic acids. In one embodiment, the kit further comprises a second agent that selectively binds to any of the foregoing novel molecules (SEQ ID NOs:1-11), and/or expression products thereof, and a control for comparing to a measured value of binding of said second agent to said isolated nucleic acid molecule or expression product thereof.

[0129] In the case of nucleic acid detection, pairs of primers for amplifying a nucleic acid molecule of the invention can be included. The preferred kits would include controls such as known amounts of nucleic acid probes, epitopes (such as expression products of any of the foregoing novel nucleic acid molecules SEQ ID NOs:1-11, e.g., SEQ ID NO:12) or anti-epitope antibodies, as well as instructions or other printed material. In certain embodiments the printed material can characterize risk of developing a skeletal degeneration condition based upon the outcome of the assay. The reagents may be packaged in containers and/or coated on wells in predetermined amounts, and the kits may include standard materials such as labeled immunological reagents (such as labeled anti-IgG antibodies) and the like. One kit is a packaged polystyrene microtiter plate coated with a polypeptide of the invention and a container containing labeled anti-human IgG antibodies. A well of the plate is contacted with, for example, a biological fluid, washed and then contacted with the anti-IgG antibody. The label is then detected. A kit embodying features of the present invention, generally designated by the numeral 11, is illustrated in FIG. 1. Kit 11 is comprised of the following major elements: packaging 15, an agent of the invention 17, a control agent 19 and instructions 21. Packaging 15 is a box-like structure for holding a vial (or number of vials) containing an agent of the invention 17, a vial (or number of vials) containing a control agent 19, and instructions 21. Individuals skilled in the art can readily modify packaging 15 to suit individual needs.

[0130] The invention also embraces methods for treating a cartilaginous tissue degeneration condition. The method involves administering to a subject in need of such treatment an agent that modulates expression of a molecule selected from the group consisting of any of SEQ ID NOs:1-67 (or expression products thereof in the case of nucleic acids), in an amount effective to treat the cartilaginous tissue degeneration condition.

[0131] “Agents that modulate expression” of a nucleic acid or a polypeptide, as used herein, are known in the art, and refer to sense and antisense nucleic acids, dominant negative nucleic acids, antibodies to the polypeptides, and the like. Any agents that modulate expression of a molecule (and as described herein, modulate its activity), are useful according to the invention.

[0132] As used herein, “downregulating expression” refers to inhibiting (i.e., reducing to a detectable extent) replication, transcription, and/or translation of a nucleic acid molecule of the invention, or an expression product thereof, since inhibition of any of these processes results in a decrease in the concentration/amount of the polypeptide encoded by the gene. The term also refers to inhibition of post-translational modifications on the polypeptide (e.g., in its phosphorylation), since inhibition of such modifications may also prevent proper expression (i.e., expression as in a wild type cell) of the encoded polypeptide. The term also refers to an increase in, or facilitation of, polypeptide degradation (e.g., via increased ubiquitinization). Polypeptide turnover can be determined using methods well known in the art and described elsewhere herein. The inhibition of gene expression can be directly determined by detecting a decrease in the level of mRNA for the gene, or the level of protein expression of the gene, using any suitable means known in the art, such as nucleic acid hybridization or antibody detection methods, respectively. Inhibition of gene expression can also be determined indirectly by detecting a change in mesenchymal cell differentiation induction activity of the molecule as a whole.

[0133] In certain embodiments, the molecule is a nucleic acid. In some embodiments the nucleic acid is operatively coupled to a gene expression sequence which directs the expression of the nucleic acid molecule within a eukaryotic cell such as a mesenchymal cell (e.g., a dermal fibroblast). The “gene expression sequence” is any regulatory nucleotide sequence, such as a promoter sequence or promoter-enhancer combination, which facilitates the efficient transcription and translation of the nucleic acid to which it is operably linked. The gene expression sequence may, for example, be a mammalian or viral promoter, such as a constitutive or inducible promoter. Constitutive mammalian promoters include, but are not limited to, the promoters for the following genes: hypoxanthine phosphoribosyl transferase (HPRT), adenosine deaminase, pyruvate kinase, α-actin promoter and other constitutive promoters. Exemplary viral promoters which function constitutively in eukaryotic cells include, for example, promoters from the simian virus, papilloma virus, adenovirus, human immunodeficiency virus (HIV), Rous sarcoma virus, cytomegalovirus, the long terminal repeats (LTR) of Moloney leukemia virus and other retroviruses, and the thymidine kinase promoter of herpes simplex virus. Other constitutive promoters are known to those of ordinary skill in the art. The promoters useful as gene expression sequences of the invention also include inducible promoters. Inducible promoters are activated in the presence of an inducing agent. For example, the metallothionein promoter is activated to increase transcription and translation in the presence of certain metal ions. Other inducible promoters are known to those of ordinary skill in the art.

[0134] In general, the gene expression sequence shall include, as necessary, 5′ non-transcribing and 5′ non-translating sequences involved with the initiation of transcription and translation, respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. In particular, such 5′ non-transcribing sequences will include a promoter region which includes a promoter sequence for transcriptional control of the operably joined nucleic acid. The gene expression sequences optionally include enhancer sequences or upstream activator sequences as desired.

[0135] Preferably, any of the nucleic acid molecules of the invention (e.g., SEQ ID NO:1-11, and 13-66) is linked to a gene expression sequence which permits expression of the nucleic acid molecule in a cell such as a mesenchymal cell (e.g., a dermal fibroblast). A sequence which permits expression of the nucleic acid molecule in a cell such as a mesenchymal cell (e.g., a dermal fibroblast) is one which is selectively active in such a cell type, thereby causing expression of the nucleic acid molecule in these cells (e.g., a collagen gene promoter). Those of ordinary skill in the art will be able to easily identify alternative promoters that are capable of expressing a nucleic acid molecule in any of the preferred cells of the invention.

[0136] The nucleic acid sequence and the gene expression sequence are said to be “operably linked” when they are covalently linked in such a way as to place the transcription and/or translation of the nucleic acid coding sequence under the influence or control of the gene expression sequence. If it is desired that the nucleic acid sequence be translated into a functional protein, two DNA sequences are said to be operably linked if induction of a promoter in the 5′ gene expression sequence results in the transcription of the nucleic acid sequence and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region to direct the transcription of the nucleic acid sequence, and/or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a gene expression sequence would be operably linked to a nucleic acid sequence if the gene expression sequence were capable of effecting transcription of that nucleic acid sequence such that the resulting transcript might be translated into the desired protein or polypeptide.

[0137] The molecules of the invention can be delivered to the preferred cell types of the invention alone or in association with a vector. In its broadest sense, a “vector” is any vehicle capable of facilitating: (1) delivery of a molecule to a target cell and/or (2) uptake of the molecule by a target cell. Preferably, the vectors transport the molecule into the target cell with reduced degradation relative to the extent of degradation that would result in the absence of the vector. Optionally, a “targeting ligand” can be attached to the vector to selectively deliver the vector to a cell which expresses on its surface the cognate receptor for the targeting ligand. In this manner, the vector (containing a nucleic acid or a protein) can be selectively delivered to a mesenchymal cell in, e.g., a joint. Methodologies for targeting include conjugates, such as those described in U.S. Pat. No. 5,391,723 to Priest. Another example of a well-known targeting vehicle is a liposome. Liposomes are commercially available from Gibco BRL. Numerous methods are published for making targeted liposomes. Preferably, the molecules of the invention are targeted for delivery to mesenchymal cells.

[0138] In general, the vectors useful in the invention include, but are not limited to, plasmids, phagemids, viruses, other vehicles derived from viral or bacterial sources that have been manipulated by the insertion or incorporation of the nucleic acid sequences of the invention, and additional nucleic acid fragments (e.g., enhancers, promoters) which can be attached to the nucleic acid sequences of the invention. Viral vectors are a preferred type of vector and include, but are not limited to, nucleic acid sequences from the following viruses: adenovirus; adeno-associated virus; retrovirus, such as Moloney murine leukemia virus; Harvey murine sarcoma virus; murine mammary tumor virus; Rous sarcoma virus; SV40-type viruses; polyoma viruses; Epstein-Barr viruses; papilloma viruses; herpes virus; vaccinia virus; polio virus; and RNA virus such as a retrovirus. One can readily employ other vectors not named but known in the art.

[0139] A particularly preferred virus for certain applications is the adeno-associated virus, a single-stranded DNA virus. The adeno-associated virus is capable of infecting a wide range of cell types and species and can be engineered to be replication-deficient. It further has advantages, such as heat and lipid solvent stability, high transduction frequencies in cells of diverse lineages, including hematopoietic cells, and lack of superinfection inhibition, thus allowing multiple series of transductions. Reportedly, the adeno-associated virus can integrate into human cellular DNA in a site-specific manner, thereby minimizing the possibility of insertional mutagenesis and variability of inserted gene expression. In addition, wild-type adeno-associated virus infections have been followed in tissue culture for greater than 100 passages in the absence of selective pressure, implying that the adeno-associated virus genomic integration is a relatively stable event. The adeno-associated virus can also function in an extrachromosomal fashion.

[0140] In general, other preferred viral vectors are based on non-cytopathic eukaryotic viruses in which non-essential genes have been replaced with the gene of interest. Non-cytopathic viruses include retroviruses, the life cycle of which involves reverse transcription of genomic viral RNA into DNA with subsequent proviral integration into host cellular DNA. Adenoviruses and retroviruses have been approved for human gene therapy trials. In general, the retroviruses are replication-deficient (i.e., capable of directing synthesis of the desired proteins, but incapable of manufacturing an infectious particle). Such genetically altered retroviral expression vectors have general utility for the high-efficiency transduction of genes in vivo. Standard protocols for producing replication-deficient retroviruses (including the steps of incorporation of exogenous genetic material into a plasmid, transfection of a packaging cell line with plasmid, production of recombinant retroviruses by the packaging cell line, collection of viral particles from tissue culture media, and infection of the target cells with viral particles) are provided in Kriegler, M., “Gene Transfer and Expression, A Laboratory Manual,” W. H. Freeman Co., New York (1990) and Murray, E. J. Ed. “Methods in Molecular Biology,” vol. 7, Humana Press, Inc., Clifton, N.J. (1991).

[0141] Another preferred retroviral vector is the vector derived from the Moloney murine leukemia virus, as described in Nabel, E. G., et al., Science, 1990, 249:1285-1288. These vectors reportedly were effective for the delivery of genes to all three layers of the arterial wall, including the media. Other preferred vectors are disclosed in Flugelman, et al., Circulation, 1992, 85:1110-1117. Additional vectors that are useful for delivering molecules of the invention are described in U.S. Pat. No. 5,674,722 by Mulligan, et al.

[0142] In addition to the foregoing vectors, other delivery methods may be used to deliver a molecule of the invention to a cell such as a mesenchymal cell, and facilitate uptake thereby.

[0143] One such preferred delivery method of the invention is a colloidal dispersion system. Colloidal dispersion systems include lipid-based systems including oil-in-water emulsions, micelles, mixed micelles, and liposomes. A preferred colloidal system of the invention is a liposome. Liposomes are artificial membrane vesicles which are useful as a delivery vector in vivo or in vitro. It has been shown that large unilamellar vesicles (LUV), which range in size from 0.2-4.0 μm, can encapsulate large macromolecules. RNA, DNA, and intact virions can be encapsulated within the aqueous interior and be delivered to cells in a biologically active form (Fraley, et al., Trends Biochem. Sci., 1981, 6:77). In order for a liposome to be an efficient gene transfer vector, one or more of the following characteristics should be present: (1) encapsulation of the gene of interest at high efficiency with retention of biological activity; (2) preferential and substantial binding to a target cell in comparison to non-target cells; (3) delivery of the aqueous contents of the vesicle to the target cell cytoplasm at high efficiency; and (4) accurate and effective expression of genetic information.

[0144] Liposomes may be targeted to a particular tissue, such as the myocardium or the vascular cell wall, by coupling the liposome to a specific ligand such as a monoclonal antibody, sugar, glycolipid, or protein. Ligands which may be useful for targeting a liposome to the vascular wall include, but are not limited to, the viral coat protein of the Hemagglutinating virus of Japan. Additionally, the vector may be coupled to a nuclear targeting peptide, which will direct the nucleic acid to the nucleus of the host cell.

[0145] Liposomes are commercially available from Gibco BRL, for example, as LIPOFECTIN™ and LIPOFECTACE™, which are formed of cationic lipids such as N-[1-(2,3 dioleyloxy)-propyl]-N,N,N-trimethylammonium chloride (DOTMA) and dimethyl dioctadecylammonium bromide (DDAB). Methods for making liposomes are well known in the art and have been described in many publications. Liposomes also have been reviewed by Gregoriadis, G. in Trends in Biotechnology, V. 3, p. 235-241 (1985). Novel liposomes for the intracellular delivery of macromolecules, including nucleic acids, are also described in PCT International application no. PCT/US96/07572 (Publication No. WO 96/40060, entitled “Intracellular Delivery of Macromolecules”).

[0146] In one particular embodiment, the preferred vehicle is a biocompatible microparticle or implant that is suitable for implantation into the mammalian recipient. Exemplary bioerodible implants that are useful in accordance with this method are described in PCT International application no. PCT/US/03307 (Publication No. WO 95/24929, entitled “Polymeric Gene Delivery System”, claiming priority to U.S. patent application Ser. No. 213,668, filed Mar. 15, 1994). PCT/US/03307 describes a biocompatible, preferably biodegradable polymeric matrix for containing an exogenous gene under the control of an appropriate promoter. The polymeric matrix is used to achieve sustained release of the exogenous gene in the patient. In accordance with the instant invention, the nucleic acids described herein are encapsulated or dispersed within the biocompatible, preferably biodegradable polymeric matrix disclosed in PCT/US/03307. The polymeric matrix preferably is in the form of a microparticle such as a microsphere (wherein a nucleic acid is dispersed throughout a solid polymeric matrix) or a microcapsule (wherein a nucleic acid is stored in the core of a polymeric shell). Other forms of the polymeric matrix for containing the nucleic acids of the invention include films, coatings, gels, implants, and stents. The size and composition of the polymeric matrix device are selected to result in favorable release kinetics in the tissue into which the matrix device is implanted. The size of the polymeric matrix device further is selected according to the method of delivery which is to be used, typically injection into a tissue or administration of a suspension by aerosol into the nasal and/or pulmonary areas. The polymeric matrix composition can be selected to have both favorable degradation rates and also to be formed of a material which is bioadhesive, to further increase the effectiveness of transfer when the device is administered to a vascular surface. The matrix composition also can be selected not to degrade, but rather, to release by diffusion over an extended period of time.

[0147] Both non-biodegradable and biodegradable polymeric matrices can be used to deliver the nucleic acids of the invention to the subject. Biodegradable matrices are preferred. Such polymers may be natural or synthetic polymers. Synthetic polymers are preferred. The polymer is selected based on the period of time over which release is desired, generally on the order of a few hours to a year or longer. Typically, release over a period ranging from between a few hours and three to twelve months is most desirable. The polymer optionally is in the form of a hydrogel that can absorb up to about 90% of its weight in water and further, optionally is cross-linked with multi-valent ions or other polymers.

[0148] In general, the nucleic acids of the invention are delivered using the bioerodible implant by way of diffusion, or more preferably, by degradation of the polymeric matrix. Exemplary synthetic polymers which can be used to form the biodegradable delivery system include: polyamides, polycarbonates, polyalkylenes, polyalkylene glycols, polyalkylene oxides, polyalkylene terephthalates, polyvinyl alcohols, polyvinyl ethers, polyvinyl esters, polyvinyl halides, polyvinylpyrrolidone, polyglycolides, polysiloxanes, polyurethanes and co-polymers thereof, alkyl cellulose, hydroxyalkyl celluloses, cellulose ethers, cellulose esters, nitro celluloses, polymers of acrylic and methacrylic esters, methyl cellulose, ethyl cellulose, hydroxypropyl cellulose, hydroxypropyl methyl cellulose, hydroxybutyl methyl cellulose, cellulose acetate, cellulose propionate, cellulose acetate butyrate, cellulose acetate phthalate, carboxylethyl cellulose, cellulose triacetate, cellulose sulphate sodium salt, poly(methyl methacrylate), poly(ethyl methacrylate), poly(butyl methacrylate), poly(isobutyl methacrylate), poly(hexyl methacrylate), poly(isodecyl methacrylate), poly(lauryl methacrylate), poly(phenyl methacrylate), poly(methyl acrylate), poly(isopropyl acrylate), poly(isobutyl acrylate), poly(octadecyl acrylate), polyethylene, polypropylene, poly(ethylene glycol), poly(ethylene oxide), poly(ethylene terephthalate), poly(vinyl alcohols), polyvinyl acetate, polyvinyl chloride, polystyrene and polyvinylpyrrolidone.

[0149] Examples of non-biodegradable polymers include ethylene vinyl acetate, poly(meth)acrylic acid, polyamides, copolymers and mixtures thereof.

[0150] Examples of biodegradable polymers include synthetic polymers such as polymers of lactic acid and glycolic acid, polyanhydrides, poly(ortho)esters, polyurethanes, poly(butyric acid), poly(valeric acid), and poly(lactide-co-caprolactone), and natural polymers such as alginate and other polysaccharides including dextran and cellulose, collagen, chemical derivatives thereof (substitutions, additions of chemical groups, for example, alkyl, alkylene, hydroxylations, oxidations, and other modifications routinely made by those skilled in the art), albumin and other hydrophilic proteins, zein and other prolamines and hydrophobic proteins, copolymers and mixtures thereof. In general, these materials degrade either by enzymatic hydrolysis or exposure to water in vivo, by surface or bulk erosion.

[0151] Bioadhesive polymers of particular interest include bioerodible hydrogels described by H. S. Sawhney, C. P. Pathak and J. A. Hubbell in Macromolecules, 1993, 26, 581-587, the teachings of which are incorporated herein, polyhyaluronic acids, casein, gelatin, glutin, polyanhydrides, polyacrylic acid, alginate, chitosan, poly(methyl methacrylates), poly(ethyl methacrylates), poly(butyl methacrylate), poly(isobutyl methacrylate), poly(hexyl methacrylate), poly(isodecyl methacrylate), poly(lauryl methacrylate), poly(phenyl methacrylate), poly(methyl acrylate), poly(isopropyl acrylate), poly(isobutyl acrylate), and poly(octadecyl acrylate). Thus, the invention provides a composition of the above-described molecules of the invention for use as a medicament, methods for preparing the medicament and methods for the sustained release of the medicament in vivo.

[0152] Compaction agents also can be used in combination with a vector of the invention. A “compaction agent”, as used herein, refers to an agent, such as a histone, that neutralizes the negative charges on the nucleic acid and thereby permits compaction of the nucleic acid into a fine granule. Compaction of the nucleic acid facilitates the uptake of the nucleic acid by the target cell. The compaction agents can be used alone, i.e., to deliver an isolated nucleic acid of the invention in a form that is more efficiently taken up by the cell or, more preferably, in combination with one or more of the above-described vectors.

[0153] Other exemplary compositions that can be used to facilitate uptake by a target cell of the nucleic acids of the invention include calcium phosphate and other chemical mediators of intracellular transport, microinjection compositions, electroporation and homologous recombination compositions (e.g., for integrating a nucleic acid into a preselected location within the target cell chromosome).

[0154] According to another aspect of the invention, a device is provided. The device comprises a material surface coated with an amount of an agent of the invention (i.e., an agent having mesenchymal cell differentiation induction activity). The amount of the agent is effective to induce mesenchymal cell differentiation in the cells of mesenchymal origin present in the tissue into which the implantable device is to be implanted. In certain embodiments, the material surface is part of an implant. The material comprising the implant may be synthetic material or organic tissue material. Important agents, cell-types, and so on, are as described elsewhere herein.

[0155] “Material surfaces” as used herein, include, but are not limited to, dental and orthopedic prosthetic implants, and organic implantable tissue such as allogeneic and/or xenogeneic tissue, organ and/or vasculature.

[0156] Implantable prosthetic devices have been used in the surgical repair or replacement of internal tissue for many years. Orthopedic implants include a wide variety of devices, each suited to fulfill particular medical needs. Examples of such devices are hip joint replacement devices, knee joint replacement devices, shoulder joint replacement devices, and pins, braces and plates used to set fractured bones. Some contemporary orthopedic and dental implants use high performance metals such as cobalt-chrome and titanium alloy to achieve high strength. These materials are readily fabricated into the complex shapes typical of these devices using mature metal working techniques including casting and machining.

[0157] In important embodiments, in addition to an agent of the invention, the material surface may also be coated with an osteogenic protein, a cell-growth potentiating agent, an anti-infective agent, and/or an anti-inflammatory agent.

[0158] Osteogenic proteins are described elsewhere herein.

[0159] A cell-growth potentiating agent as used herein is an agent which stimulates growth of a cell and includes growth factors such as PDGF, EGF, FGF, TGF, NGF, CNTF, and GDNF.

[0160] An anti-infective agent as used herein is an agent which reduces the activity of or kills a microorganism and includes: Aztreonam; Chlorhexidine Gluconate; Imidurea; Lycetamine; Nibroxane; Pirazmonam Sodium; Propionic Acid; Pyrithione Sodium; Sanguinarium Chloride; Tigemonam Dicholine; Acedapsone; Acetosulfone Sodium; Alamecin; Alexidine; Amdinocillin; Amdinocillin Pivoxil; Amicycline; Amifloxacin; Amifloxacin Mesylate; Amikacin; Amikacin Sulfate; Aminosalicylic acid; Aminosalicylate sodium; Amoxicillin; Amphomycin; Ampicillin; Ampicillin Sodium; Apalcillin Sodium; Apramycin; Aspartocin; Astromicin Sulfate; Avilamycin; Avoparcin; Azithromycin; Azlocillin; Azlocillin Sodium; Bacampicillin Hydrochloride; Bacitracin; Bacitracin Methylene Disalicylate; Bacitracin Zinc; Bambermycins; Benzoylpas Calcium; Berythromycin; Betamicin Sulfate; Biapenem; Biniramycin; Biphenamine Hydrochloride; Bispyrithione Magsulfex; Butikacin; Butirosin Sulfate; Capreomycin Sulfate; Carbadox; Carbenicillin Disodium; Carbenicillin Indanyl Sodium; Carbenicillin Phenyl Sodium; Carbenicillin Potassium; Carumonam Sodium; Cefaclor; Cefadroxil; Cefamandole; Cefamandole Nafate; Cefamandole Sodium; Cefaparole; Cefatrizine; Cefazaflur Sodium; Cefazolin; Cefazolin Sodium; Cefbuperazone; Cefdinir; Cefepime; Cefepime Hydrochloride; Cefetecol; Cefixime; Cefmenoxime Hydrochloride; Cefmetazole; Cefmetazole Sodium; Cefonicid Monosodium; Cefonicid Sodium; Cefoperazone Sodium; Ceforanide; Cefotaxime Sodium; Cefotetan; Cefotetan Disodium; Cefotiam Hydrochloride; Cefoxitin; Cefoxitin Sodium; Cefpimizole; Cefpimizole Sodium; Cefpiramide; Cefpiramide Sodium; Cefpirome Sulfate; Cefpodoxime Proxetil; Cefprozil; Cefroxadine; Cefsulodin Sodium; Ceftazidime; Ceftibuten; Ceftizoxime Sodium; Ceftriaxone Sodium; Cefuroxime; Cefuroxime Axetil; Cefuroxime Pivoxetil; Cefuroxime Sodium; Cephacetrile Sodium; Cephalexin; Cephalexin Hydrochloride; Cephaloglycin; Cephaloridine; Cephalothin Sodium; Cephapirin Sodium; Cephradine; Cetocycline Hydrochloride; Cetophenicol; Chloramphenicol; Chloramphenicol Palmitate; Chloramphenicol Pantothenate Complex; Chloramphenicol Sodium Succinate; Chlorhexidine Phosphanilate; Chloroxylenol; Chlortetracycline Bisulfate; Chlortetracycline Hydrochloride; Cinoxacin; Ciprofloxacin; Ciprofloxacin Hydrochloride; Cirolemycin; Clarithromycin; Clinafloxacin Hydrochloride; Clindamycin; Clindamycin Hydrochloride; Clindamycin Palmitate Hydrochloride; Clindamycin Phosphate; Clofazimine; Cloxacillin Benzathine; Cloxacillin Sodium; Cloxyquin; Colistimethate Sodium; Colistin Sulfate; Coumermycin; Coumermycin Sodium; Cyclacillin; Cycloserine; Dalfopristin; Dapsone; Daptomycin; Demeclocycline; Demeclocycline Hydrochloride; Demecycline; Denofungin; Diaveridine; Dicloxacillin; Dicloxacillin Sodium; Dihydrostreptomycin Sulfate; Dipyrithione; Dirithromycin; Doxycycline; Doxycycline Calcium; Doxycycline Fosfatex; Doxycycline Hyclate; Droxacin Sodium; Enoxacin; Epicillin; Epitetracycline Hydrochloride; Erythromycin; Erythromycin Acistrate; Erythromycin Estolate; Erythromycin Ethylsuccinate; Erythromycin Gluceptate; Erythromycin Lactobionate; Erythromycin Propionate; Erythromycin Stearate; Ethambutol Hydrochloride; Ethionamide; Fleroxacin; Floxacillin; Fludalanine; Flumequine; Fosfomycin; Fosfomycin Tromethamine; Fumoxicillin; Furazolium Chloride; Furazolium Tartrate; Fusidate Sodium; Fusidic Acid; Gentamicin Sulfate; Gloximonam; Gramicidin; Haloprogin; Hetacillin; Hetacillin Potassium; Hexedine; Ibafloxacin; Imipenem; Isoconazole; Isepamicin; Isoniazid; Josamycin; Kanamycin Sulfate; Kitasamycin; Levofuraltadone; Levopropylcillin Potassium; Lexithromycin; Lincomycin; Lincomycin Hydrochloride; Lomefloxacin; Lomefloxacin Hydrochloride; Lomefloxacin Mesylate; Loracarbef; Mafenide; Meclocycline; Meclocycline Sulfosalicylate; Megalomicin Potassium Phosphate; Mequidox; Meropenem; Methacycline; Methacycline Hydrochloride; Methenamine; Methenamine Hippurate; Methenamine Mandelate; Methicillin Sodium; Metioprim; Metronidazole Hydrochloride; Metronidazole Phosphate; Mezlocillin; Mezlocillin Sodium; Minocycline; Minocycline Hydrochloride; Mirincamycin Hydrochloride; Monensin; Monensin Sodium; Nafcillin Sodium; Nalidixate Sodium; Nalidixic Acid; Natamycin; Nebramycin; Neomycin Palmitate; Neomycin Sulfate; Neomycin Undecylenate; Netilmicin Sulfate; Neutramycin; Nifuradene; Nifuraldezone; Nifuratel; Nifuratrone; Nifurdazil; Nifurimide; Nifurpirinol; Nifurquinazol; Nifurthiazole; Nitrocycline; Nitrofurantoin; Nitromide; Norfloxacin; Novobiocin Sodium; Ofloxacin; Ormetoprim; Oxacillin Sodium; Oximonam; Oximonam Sodium; Oxolinic Acid; Oxytetracycline; Oxytetracycline Calcium; Oxytetracycline Hydrochloride; Paldimycin; Parachlorophenol; Paulomycin; Pefloxacin; Pefloxacin Mesylate; Penamecillin; Penicillin G Benzathine; Penicillin G Potassium; Penicillin G Procaine; Penicillin G Sodium; Penicillin V; Penicillin V Benzathine; Penicillin V Hydrabamine; Penicillin V Potassium; Pentizidone Sodium; Phenyl Aminosalicylate; Piperacillin Sodium; Pirbenicillin Sodium; Piridicillin Sodium; Pirlimycin Hydrochloride; Pivampicillin Hydrochloride; Pivampicillin Pamoate; Pivampicillin Probenate; Polymyxin B Sulfate; Porfiromycin; Propikacin; Pyrazinamide; Pyrithione Zinc; Quindecamine Acetate; Quinupristin; Racephenicol; Ramoplanin; Ranimycin; Relomycin; Repromicin; Rifabutin; Rifametane; Rifamexil; Rifamide; Rifampin; Rifapentine; Rifaximin; Rolitetracycline; Rolitetracycline Nitrate; Rosaramicin; Rosaramicin Butyrate; Rosaramicin Propionate; Rosaramicin Sodium Phosphate; Rosaramicin Stearate; Rosoxacin; Roxarsone; Roxithromycin; Sancycline; Sanfetrinem Sodium; Sarmoxicillin; Sarpicillin; Scopafungin; Sisomicin; Sisomicin Sulfate; Sparfloxacin; Spectinomycin Hydrochloride; Spiramycin; Stallimycin Hydrochloride; Steffimycin; Streptomycin Sulfate; Streptonicozid; Sulfabenz; Sulfabenzamide; Sulfacetamide; Sulfacetamide Sodium; Sulfacytine; Sulfadiazine; Sulfadiazine Sodium; Sulfadoxine; Sulfalene; Sulfamerazine; Sulfameter; Sulfamethazine; Sulfamethizole; Sulfamethoxazole; Sulfamonomethoxine; Sulfamoxole; Sulfanilate Zinc; Sulfanitran; Sulfasalazine; Sulfasomizole; Sulfathiazole; Sulfazamet; Sulfisoxazole; Sulfisoxazole Acetyl; Sulfisoxazole Diolamine; Sulfomyxin; Sulopenem; Sultamicillin; Suncillin Sodium; Talampicillin Hydrochloride; Teicoplanin; Temafloxacin Hydrochloride; Temocillin; Tetracycline; Tetracycline Hydrochloride; Tetracycline Phosphate Complex; Tetroxoprim; Thiamphenicol; Thiphencillin Potassium; Ticarcillin Cresyl Sodium; Ticarcillin Disodium; Ticarcillin Monosodium; Ticlatone; Tiodonium Chloride; Tobramycin; Tobramycin Sulfate; Tosufloxacin; Trimethoprim; Trimethoprim Sulfate; Trisulfapyrimidines; Troleandomycin; Trospectomycin Sulfate; Tyrothricin; Vancomycin; Vancomycin Hydrochloride; Virginiamycin; Zorbamycin; Difloxacin Hydrochloride; Lauryl Isoquinolinium Bromide; Moxalactam Disodium; Ornidazole; Pentisomicin; and Sarafloxacin Hydrochloride.

[0161] Anti-inflammatory agents are well known in the art and include: Alclofenac; Alclometasone Dipropionate; Algestone Acetonide; Alpha Amylase; Amcinafal; Amcinafide; Amfenac Sodium; Amiprilose Hydrochloride; Anakinra; Anirolac; Anitrazafen; Apazone; Balsalazide Disodium; Bendazac; Benoxaprofen; Benzydamine Hydrochloride; Bromelains; Broperamole; Budesonide; Carprofen; Cicloprofen; Cintazone; Cliprofen; Clobetasol Propionate; Clobetasone Butyrate; Clopirac; Cloticasone Propionate; Cormethasone Acetate; Cortodoxone; Deflazacort; Desonide; Desoximetasone; Dexamethasone Dipropionate; Diclofenac Potassium; Diclofenac Sodium; Diflorasone Diacetate; Diflumidone Sodium; Diflunisal; Difluprednate; Diftalone; Dimethyl Sulfoxide; Drocinonide; Endrysone; Enlimomab; Enolicam Sodium; Epirizole; Etodolac; Etofenamate; Felbinac; Fenamole; Fenbufen; Fenclofenac; Fenclorac; Fendosal; Fenpipalone; Fentiazac; Flazalone; Fluazacort; Flufenamic Acid; Flumizole; Flunisolide Acetate; Flunixin; Flunixin Meglumine; Fluocortin Butyl; Fluorometholone Acetate; Fluquazone; Flurbiprofen; Fluretofen; Fluticasone Propionate; Furaprofen; Furobufen; Halcinonide; Halobetasol Propionate; Halopredone Acetate; Ibufenac; Ibuprofen; Ibuprofen Aluminum; Ibuprofen Piconol; Ilonidap; Indomethacin; Indomethacin Sodium; Indoprofen; Indoxole; Intrazole; Isoflupredone Acetate; Isoxepac; Isoxicam; Ketoprofen; Lofemizole Hydrochloride; Lornoxicam; Loteprednol Etabonate; Meclofenamate Sodium; Meclofenamic Acid; Meclorisone Dibutyrate; Mefenamic Acid; Mesalamine; Meseclazone; Methylprednisolone Suleptanate; Morniflumate; Nabumetone; Naproxen; Naproxen Sodium; Naproxol; Nimazone; Olsalazine Sodium; Orgotein; Orpanoxin; Oxaprozin; Oxyphenbutazone; Paranyline Hydrochloride; Pentosan Polysulfate Sodium; Phenbutazone Sodium Glycerate; Pirfenidone; Piroxicam; Piroxicam Cinnamate; Piroxicam Olamine; Pirprofen; Prednazate; Prifelone; Prodolic Acid; Proquazone; Proxazole; Proxazole Citrate; Rimexolone; Romazarit; Salcolex; Salnacedin; Salsalate; Sanguinarium Chloride; Seclazone; Sermetacin; Sudoxicam; Sulindac; Suprofen; Talmetacin; Talniflumate; Talosalate; Tebufelone; Tenidap; Tenidap Sodium; Tenoxicam; Tesicam; Tesimide; Tetrydamine; Tiopinac; Tixocortol Pivalate; Tolmetin; Tolmetin Sodium; Triclonide; Triflumidate; Zidometacin; Zomepirac Sodium.

[0162] The invention also provides methods for the diagnosis and therapy of congenital and/or acquired conditions that affect the skeleton. Such disorders include cartilaginous tissue degeneration conditions (e.g., all forms of arthritis including, but not limited to, osteoarthritis, rheumatoid arthritis, gout arthritis, adjuvant arthritis, arthritis deformans, infectious arthritis, and osteochondrosis).

[0163] The methods of the invention are useful in both the acute and the prophylactic treatment of any of the foregoing conditions. As used herein, an acute treatment refers to the treatment of subjects having a particular condition. Prophylactic treatment refers to the treatment of subjects at risk of having the condition, but not presently having or experiencing the symptoms of the condition.

[0164] In their broadest sense, the terms “treatment” or “to treat” refer to both acute and prophylactic treatments. If the subject in need of treatment is experiencing a condition (or has or is having a particular condition), then treating the condition refers to ameliorating, reducing or eliminating the condition or one or more symptoms arising from the condition. In some preferred embodiments, treating the condition refers to ameliorating, reducing or eliminating a specific symptom or a specific subset of symptoms associated with the condition. If the subject in need of treatment is one who is at risk of having a condition, then treating the subject refers to reducing the risk of the subject having the condition.

[0165] The mode of administration and dosage of a therapeutic agent of the invention will vary with the particular stage of the condition being treated, the age and physical condition of the subject being treated, the duration of the treatment, the nature of the concurrent therapy (if any), the specific route of administration, and the like factors within the knowledge and expertise of the health practitioner.

[0166] As described herein, the agents of the invention are administered in effective amounts to treat any of the foregoing skeletal degeneration conditions. In general, an effective amount is any amount that can cause a beneficial change in a desired tissue of a subject. Preferably, an effective amount is that amount sufficient to cause a favorable phenotypic change in a particular condition such as a lessening, alleviation or elimination of a symptom or of a condition as a whole.

[0167] In general, an effective amount is that amount of a pharmaceutical preparation that alone, or together with further doses, produces the desired response. This may involve only slowing the progression of the condition temporarily, although more preferably, it involves halting the progression of the condition permanently or delaying the onset of or preventing the condition from occurring. This can be monitored by routine methods. Generally, doses of active compounds would be from about 0.01 mg/kg per day to 1000 mg/kg per day. It is expected that doses ranging from 50-500 mg/kg will be suitable, preferably orally and in one or several administrations per day.
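By way of illustration only, the weight-based dose arithmetic implied by the ranges above can be sketched as follows. The function name, the example body weight, and the split into equal administrations are assumptions made for this sketch and are not taken from the specification; the only values drawn from the text are the stated bounds of about 0.01 to 1000 mg/kg per day.

```python
def daily_dose(weight_kg, dose_mg_per_kg, administrations_per_day=1):
    """Return (total mg per day, mg per administration).

    Hypothetical helper: it only multiplies a per-kilogram dose by body
    weight and divides it evenly across administrations.
    """
    if weight_kg <= 0 or administrations_per_day < 1:
        raise ValueError("weight must be positive and at least one administration is required")
    # Bounds taken from the general range stated in the text.
    if not 0.01 <= dose_mg_per_kg <= 1000:
        raise ValueError("dose is outside the stated 0.01-1000 mg/kg per day range")
    total_mg = weight_kg * dose_mg_per_kg
    return total_mg, total_mg / administrations_per_day


# Example with an assumed 70 kg subject at 50 mg/kg per day, given twice daily.
total, per_administration = daily_dose(70, 50, administrations_per_day=2)
print(total, per_administration)
```

The sketch performs arithmetic only; selection of an actual dose remains within the judgment of the health practitioner, as set out in the following paragraph.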

[0168] Such amounts will depend, of course, on the particular condition being treated, the severity of the condition, the individual patient parameters including age, physical condition, size and weight, the duration of the treatment, the nature of concurrent therapy (if any), the specific route of administration and like factors within the knowledge and expertise of the health practitioner. Lower doses will result from certain forms of administration, such as intravenous administration. In the event that a response in a subject is insufficient at the initial doses applied, higher doses (or effectively higher doses by a different, more localized delivery route) may be employed to the extent that patient tolerance permits. Multiple doses per day are contemplated to achieve appropriate systemic levels of compounds. It is preferred generally that a maximum dose be used, that is, the highest safe dose according to sound medical judgment. It will be understood by those of ordinary skill in the art, however, that a patient may insist upon a lower dose or tolerable dose for medical reasons, psychological reasons or for virtually any other reasons.

[0169] The agents of the invention may be combined, optionally, with a pharmaceutically-acceptable carrier to form a pharmaceutical preparation. The term “pharmaceutically-acceptable carrier” as used herein means one or more compatible solid or liquid fillers, diluents or encapsulating substances which are suitable for administration into a human. The term “carrier” denotes an organic or inorganic ingredient, natural or synthetic, with which the active ingredient is combined to facilitate the application. The components of the pharmaceutical compositions also are capable of being co-mingled with the molecules of the present invention, and with each other, in a manner such that there is no interaction which would substantially impair the desired pharmaceutical efficacy. In some aspects, the pharmaceutical preparations comprise an agent of the invention in an amount effective to treat a disorder.

[0170] The pharmaceutical preparations may contain suitable buffering agents, including: acetic acid in a salt; citric acid in a salt; boric acid in a salt; or phosphoric acid in a salt. The pharmaceutical compositions also may contain, optionally, suitable preservatives, such as: benzalkonium chloride; chlorobutanol; parabens or thimerosal.

[0171] A variety of administration routes are available. The particular mode selected will depend, of course, upon the particular drug selected, the severity of the condition being treated and the dosage required for therapeutic efficacy. The methods of the invention, generally speaking, may be practiced using any mode of administration that is medically acceptable, meaning any mode that produces effective levels of the active compounds without causing clinically unacceptable adverse effects. Such modes of administration include oral, rectal, topical, nasal, intradermal, transdermal, or parenteral routes. The term “parenteral” includes subcutaneous, intravenous, intramuscular, or infusion. Intravenous or intramuscular routes are not particularly suitable for long-term therapy and prophylaxis. As an example, pharmaceutical compositions may be formulated in a variety of different ways and for a variety of administration modes including tablets, capsules, powders, suppositories, injections and nasal sprays. A preferred mode of administration is a local, site-specific administration to the tissue location in need of repair.

[0172] The pharmaceutical preparations may conveniently be presented in unit dosage form and may be prepared by any of the methods well-known in the art of pharmacy. All methods include the step of bringing the active agent into association with a carrier which constitutes one or more accessory ingredients. In general, the compositions are prepared by uniformly and intimately bringing the active compound into association with a liquid carrier, a finely divided solid carrier, or both, and then, if necessary, shaping the product.

[0173] Compositions suitable for oral administration may be presented as discrete units, such as capsules, tablets, or lozenges, each containing a predetermined amount of the active compound. Other compositions include suspensions in aqueous liquids or non-aqueous liquids such as a syrup, elixir or an emulsion.

[0174] Compositions suitable for parenteral administration conveniently comprise a sterile aqueous preparation of an agent of the invention, which is preferably isotonic with the blood of the recipient. This aqueous preparation may be formulated according to known methods using suitable dispersing or wetting agents and suspending agents. The sterile injectable preparation also may be a sterile injectable solution or suspension in a non-toxic parenterally-acceptable diluent or solvent, for example, as a solution in 1,3-butane diol. Among the acceptable vehicles and solvents that may be employed are water, Ringer's solution, and isotonic sodium chloride solution. In addition, sterile, fixed oils are conventionally employed as a solvent or suspending medium. For this purpose any bland fixed oil may be employed including synthetic mono- or di-glycerides. In addition, fatty acids such as oleic acid may be used in the preparation of injectables. Formulations suitable for oral, subcutaneous, intravenous, intramuscular, etc. administrations can be found in Remington's Pharmaceutical Sciences, Mack Publishing Co., Easton, Pa.

[0175] The term “permit entry” of a molecule into a cell according to the invention has the following meanings depending upon the nature of the molecule. For an isolated nucleic acid it is meant to describe entry of the nucleic acid through the cell membrane and into the cell nucleus, whereupon the “nucleic acid transgene” can utilize the cell machinery to produce functional polypeptides encoded by the nucleic acid. By “nucleic acid transgene” it is meant to describe all of the nucleic acids of the invention with or without the associated vectors. For a polypeptide, it is meant to describe entry of the polypeptide through the cell membrane and into the cell cytoplasm, and if necessary, utilization of the cell cytoplasmic machinery to functionally modify the polypeptide (e.g., to an active form).

[0176] Various techniques may be employed for introducing nucleic acids of the invention into cells, depending on whether the nucleic acids are introduced in vitro or in vivo in a host. Such techniques include transfection of nucleic acid-CaPO₄ precipitates, transfection of nucleic acids associated with DEAE, transfection with a retrovirus including the nucleic acid of interest, liposome mediated transfection, and the like. For certain uses, it is preferred to target the nucleic acid to particular cells. In such instances, a vehicle used for delivering a nucleic acid of the invention into a cell (e.g., a retrovirus, or other virus; a liposome) can have a targeting molecule attached thereto. For example, a molecule such as an antibody specific for a surface membrane protein on the target cell or a ligand for a receptor on the target cell can be bound to or incorporated within the nucleic acid delivery vehicle. For example, where liposomes are employed to deliver the nucleic acids of the invention, proteins which bind to a surface membrane protein associated with endocytosis may be incorporated into the liposome formulation for targeting and/or to facilitate uptake. Such proteins include capsid proteins or fragments thereof tropic for a particular cell type, antibodies for proteins which undergo internalization in cycling, proteins that target intracellular localization and enhance intracellular half life, and the like. Polymeric delivery systems also have been used successfully to deliver nucleic acids into cells, as is known by those skilled in the art. Such systems even permit oral delivery of nucleic acids.

[0177] Other delivery systems can include time-release, delayed release or sustained release delivery systems. Such systems can avoid repeated administrations of an agent of the present invention, increasing convenience to the subject and the physician. Many types of release delivery systems are available and known to those of ordinary skill in the art. They include polymer base systems such as poly(lactide-glycolide), copolyoxalates, polycaprolactones, polyesteramides, polyorthoesters, polyhydroxybutyric acid, and polyanhydrides. Microcapsules of the foregoing polymers containing drugs are described in, for example, U.S. Pat. No. 5,075,109. Delivery systems also include non-polymer systems that are: lipids including sterols such as cholesterol, cholesterol esters and fatty acids or neutral fats such as mono-, di- and tri-glycerides; hydrogel release systems; sylastic systems; peptide based systems; wax coatings; compressed tablets using conventional binders and excipients; partially fused implants; and the like. Specific examples include, but are not limited to: (a) erosional systems in which an agent of the invention is contained in a form within a matrix such as those described in U.S. Pat. Nos. 4,452,775, 4,675,189, and 5,736,152, and (b) diffusional systems in which an active component permeates at a controlled rate from a polymer such as described in U.S. Pat. Nos. 3,854,480, 5,133,974 and 5,407,686. In addition, pump-based hardware delivery systems can be used, some of which are adapted for implantation.

[0178] Use of a long-term sustained release implant may be desirable. Long-term release, as used herein, means that the implant is constructed and arranged to deliver therapeutic levels of the active ingredient for at least 30 days, and preferably 60 days. Long-term sustained release implants are well-known to those of ordinary skill in the art and include some of the release systems described above. Specific examples include, but are not limited to, long-term sustained release implants described in U.S. Pat. No. 4,748,024, and Canadian Patent No. 1330939.

[0179] The invention also involves the administration, and in some embodiments co-administration, of agents other than the molecules of the invention (e.g., osteogenic proteins such as Bone Morphogenetic Protein [BMP] nucleic acids and polypeptides, and/or fragments thereof) that when administered in effective amounts can act cooperatively, additively or synergistically with a molecule of the invention to: (i) modulate mesenchymal cell differentiation induction activity, and (ii) treat any of the conditions in which mesenchymal cell differentiation induction activity of a molecule of the invention is involved. Agents other than the molecules of the invention include osteogenic factors.

[0180] True osteogenic factors capable of inducing the above-described cascade of events that result in cartilage/bone formation are well known in the art. Certain of these proteins occur in nature as disulfide-bonded dimeric proteins, and are referred to in the art as “osteogenic” proteins, “osteoinductive” proteins, and “bone morphogenetic” proteins. Whether naturally-occurring or synthetically prepared, these osteogenic proteins, when implanted in a mammal typically in association with a substrate that allows the attachment, proliferation and differentiation of migratory cells, are capable of inducing recruitment of accessible cells (such as chondroblasts) and stimulating their proliferation, inducing differentiation into chondrocytes and osteoblasts, and further inducing differentiation of intermediate cartilage, vascularization, bone formation, remodeling, and finally marrow differentiation. Those proteins are referred to as members of the Vgr-1/OP1 protein subfamily of the TGF-β super gene family of structurally related proteins. Members include the proteins described in the art as OP1 (BMP-7), OP2 (BMP-8), BMP2, BMP3, BMP4, BMP5, BMP6, 60A, DPP, Vgr-1 and Vg1. See, e.g., U.S. Pat. No. 5,011,691; U.S. Pat. No. 5,266,683; Ozkaynak et al. (1990) EMBO J. 9: 2085-2093; Wharton et al. (1991) PNAS 88: 9214-9218; Ozkaynak (1992) J. Biol. Chem. 267: 25220-25227; Celeste et al. (1991) PNAS 87: 9843-9847; and Lyons et al. (1989) PNAS 86: 4554-4558. These disclosures describe the amino acid and DNA sequences, as well as the chemical and physical characteristics of these proteins. See also Wozney et al. (1988) Science 242: 1528-1534; BMP 9 (WO93/00432, published Jan. 7, 1993); DPP (Padgett et al. (1987) Nature 325: 81-84); and Vg-1 (Weeks (1987) Cell 51: 861-867).

[0181] “Co-administering,” as used herein, refers to administering simultaneously two or more compounds of the invention (e.g., a nucleic acid and/or polypeptide with mesenchymal cell differentiation induction activity, and an agent known to be beneficial in the treatment of a skeletal degeneration condition, such as an osteogenic protein), as an admixture in a single composition, or sequentially, close enough in time so that the compounds may exert an additive or even synergistic effect, i.e., on regenerating cartilage/bone.

[0182] The invention also embraces solid-phase nucleic acid molecule arrays. The array consists essentially of a set of nucleic acid molecules, expression products thereof, or fragments (of either the nucleic acid or the polypeptide molecule) thereof, each nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, fixed to a solid substrate. In some embodiments, the solid-phase array further comprises at least one control nucleic acid molecule. In certain embodiments, the set of nucleic acid molecules comprises at least one, at least two, at least three, at least four, or even at least five nucleic acid molecules, each selected from the group consisting of SEQ ID NO:1-11, and 13-66. In preferred embodiments, the set of nucleic acid molecules comprises a maximum number of 100 different nucleic acid molecules. In important embodiments, the set of nucleic acid molecules comprises a maximum number of 10 different nucleic acid molecules. In further important embodiments, the set of nucleic acid molecules comprises at least one, at least two, at least three, at least four, or even at least five nucleic acid molecules, each selected from the group consisting of SEQ ID NOs:1-11.

[0183] According to the invention, standard hybridization techniques of microarray technology are utilized to assess patterns of nucleic acid expression and identify nucleic acid expression. Microarray technology, which is also known by other names including DNA chip technology, gene chip technology, and solid-phase nucleic acid array technology, is well known to those of ordinary skill in the art and is based on, but not limited to, obtaining an array of identified nucleic acid probes (e.g., molecules described elsewhere herein, such as SEQ ID NO:1-11, and 13-66) on a fixed substrate, labeling target molecules with reporter molecules (e.g., radioactive, chemiluminescent, or fluorescent tags such as fluorescein, Cye3-dUTP, or Cye5-dUTP), hybridizing target nucleic acids to the probes, and evaluating target-probe hybridization. A probe with a nucleic acid sequence that perfectly matches the target sequence will, in general, result in detection of a stronger reporter-molecule signal than will probes with less perfect matches. Many components and techniques utilized in nucleic acid microarray technology are presented in The Chipping Forecast, Nature Genetics, Vol. 21, January 1999, the entire contents of which are incorporated by reference herein.

[0184] According to the present invention, microarray substrates may include but are not limited to glass, silica, aluminosilicates, borosilicates, metal oxides such as alumina and nickel oxide, various clays, nitrocellulose, or nylon. In all embodiments a glass substrate is preferred. According to the invention, probes are selected from the group of nucleic acids including, but not limited to: DNA, genomic DNA, cDNA, and oligonucleotides; and may be natural or synthetic. Oligonucleotide probes preferably are 20 to 25-mer oligonucleotides and DNA/cDNA probes preferably are 500 to 5000 bases in length, although other lengths may be used. Appropriate probe length may be determined by one of ordinary skill in the art by following art-known procedures. In one embodiment, preferred probes are sets of two or more of the nucleic acid molecules set forth as SEQ ID NO:1-11, and 13-66. Probes may be purified to remove contaminants using standard methods known to those of ordinary skill in the art such as gel filtration or precipitation.

[0185] In one embodiment, the microarray substrate may be coated with a compound to enhance synthesis of the probe on the substrate. Such compounds include, but are not limited to, oligoethylene glycols. In another embodiment, coupling agents or groups on the substrate can be used to covalently link the first nucleotide or oligonucleotide to the substrate. These agents or groups may include, but are not limited to: amino, hydroxy, bromo, and carboxy groups. These reactive groups are preferably attached to the substrate through a hydrocarbyl radical such as an alkylene or phenylene divalent radical, one valence position occupied by the chain bonding and the remaining attached to the reactive groups. These hydrocarbyl groups may contain up to about ten carbon atoms, preferably up to about six carbon atoms. Alkylene radicals are usually preferred containing two to four carbon atoms in the principal chain. These and additional details of the process are disclosed, for example, in U.S. Pat. No. 4,458,066, which is incorporated by reference in its entirety.

[0186] In one embodiment, probes are synthesized directly on the substrate in a predetermined grid pattern using methods such as light-directed chemical synthesis, photochemical deprotection, or delivery of nucleotide precursors to the substrate and subsequent probe production.

[0187] In another embodiment, the substrate may be coated with a compound to enhance binding of the probe to the substrate. Such compounds include, but are not limited to: polylysine, amino silanes, amino-reactive silanes (Chipping Forecast, 1999) or chromium (Gwynne and Page, 2000). In this embodiment, presynthesized probes are applied to the substrate in a precise, predetermined volume and grid pattern, utilizing a computer-controlled robot to apply probe to the substrate in a contact-printing manner or in a non-contact manner such as ink jet or piezo-electric delivery. Probes may be covalently linked to the substrate with methods that include, but are not limited to, UV-irradiation. In another embodiment probes are linked to the substrate with heat.

[0188] Targets are nucleic acids selected from the group including, but not limited to: DNA, genomic DNA, cDNA, RNA, and mRNA, and may be natural or synthetic. In all embodiments, nucleic acid molecules from subjects suspected of developing or having a skeletal degeneration condition are preferred. In certain embodiments of the invention, one or more control nucleic acid molecules are attached to the substrate. Preferably, control nucleic acid molecules allow determination of factors including but not limited to: nucleic acid quality and binding characteristics; reagent quality and effectiveness; hybridization success; and analysis thresholds and success. Control nucleic acids may include, but are not limited to, expression products of genes such as housekeeping genes or fragments thereof.

[0189] To select a set of skeletal degeneration condition markers, the expression data generated by, for example, microarray analysis of gene expression, is preferably analyzed to determine which genes in different categories of patients (each category of patients being a different skeletal degeneration disorder) are significantly differentially expressed. The significance of gene expression can be determined using Permax computer software, although any standard statistical package that can discriminate significant differences in expression may be used. Permax performs permutation 2-sample t-tests on large arrays of data. For high dimensional vectors of observations, the Permax software computes t-statistics for each attribute, and assesses significance using the permutation distribution of the maximum and minimum over all attributes. The main use is to determine the attributes (genes) that are the most different between two groups (e.g., a healthy control subject and a subject with a particular skeletal degeneration disorder), measuring “most different” using the value of the t-statistics, and their significance levels.
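The general approach described above, a per-gene two-sample t-statistic with significance assessed against the permutation distribution of the maximum and minimum statistic over all genes, can be sketched as follows. This is a minimal illustration and not the Permax software itself: the pooled-variance form of the t-statistic, the function names, the number of permutations, and the data layout are all assumptions made for this sketch.

```python
import math
import random


def t_stat(a, b):
    """Pooled-variance two-sample t-statistic (an assumed variant)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    pooled = (ssa + ssb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (ma - mb) / se if se > 0 else 0.0


def max_t_permutation(data, labels, n_perm=1000, seed=0):
    """data: dict mapping gene -> expression values, one per sample.
    labels: 0/1 group label per sample, in the same order as the values.
    Returns gene -> (t, upper-tail p, lower-tail p), where the p-values
    come from the permutation distribution of the max and min t."""
    rng = random.Random(seed)

    def all_t(lab):
        out = {}
        for gene, vals in data.items():
            g1 = [v for v, l in zip(vals, lab) if l == 1]
            g0 = [v for v, l in zip(vals, lab) if l == 0]
            out[gene] = t_stat(g1, g0)
        return out

    observed = all_t(labels)
    maxima, minima = [], []
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # random reassignment of group labels
        ts = list(all_t(shuffled).values())
        maxima.append(max(ts))
        minima.append(min(ts))

    result = {}
    for gene, t in observed.items():
        p_up = sum(m >= t for m in maxima) / n_perm
        p_down = sum(m <= t for m in minima) / n_perm
        result[gene] = (t, p_up, p_down)
    return result
```

Comparing each gene against the extreme statistic over all genes, rather than against its own permutation distribution, is what controls for testing many genes at once; a gene is called upregulated only if its t-statistic is large relative to the largest value expected by chance anywhere on the array.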

[0190] Expression of nucleic acid molecules of the invention can also be determined using protein measurement methods to determine expression of SEQ ID NO:1-11, and 13-66, e.g., by determining the expression of polypeptides encoded by SEQ ID NO:1-11, and 13-66, respectively. Preferred methods of specifically and quantitatively measuring proteins include, but are not limited to: mass spectroscopy-based methods such as surface enhanced laser desorption ionization (SELDI; e.g., Ciphergen ProteinChip System), non-mass spectroscopy-based methods, and immunohistochemistry-based methods such as 2-dimensional gel electrophoresis.

[0191] SELDI methodology may, through procedures known to those of ordinary skill in the art, be used to vaporize microscopic amounts of tumor protein and to create a “fingerprint” of individual proteins, thereby allowing simultaneous measurement of the abundance of many proteins in a single sample. Preferably SELDI-based assays may be utilized to characterize skeletal degeneration conditions as well as stages of such conditions. Such assays preferably include, but are not limited to the following examples. Gene products discovered by RNA microarrays may be selectively measured by specific (antibody mediated) capture to the SELDI protein disc (e.g., selective SELDI). Gene products discovered by protein screening (e.g., with 2-D gels), may be resolved by “total protein SELDI” optimized to visualize those particular markers of interest from among SEQ ID NOs:1-67. Predictive models of classification from SELDI measurement of multiple markers from among SEQ ID NOs:1-67 may be utilized for the SELDI strategies.

[0192] The use of any of the foregoing microarray methods to determine expression of any of the foregoing nucleic acids of the invention can be done with routine methods known to those of ordinary skill in the art and the expression determined by protein measurement methods may be correlated to predetermined levels of a marker used as a prognostic method for selecting treatment strategies for patients with skeletal degeneration.

[0193] The invention will be more fully understood by reference to the following examples. These examples, however, are merely intended to illustrate the embodiments of the invention and are not to be construed to limit the scope of the invention.

EXAMPLES

[0194] Introduction

[0195] Much of what is known regarding differentiation of chondroblasts has been obtained from studies on skeletal development. In embryogenesis, mesenchymal cells condense to form cartilaginous anlagen. Several genes have been identified that regulate this process, for example, sox9 [1], gdf5 [2], and noggin [3], but the role that those genes play in post-natal chondroblastic differentiation is unclear.

[0196] We previously described a novel in vitro model of induced chondroblast differentiation [4]. We designed the collagen sponge culture system to mimic the three-dimensional (3-D) geometry and density of subcutaneous implants of demineralized bone powder (DBP) [5]. Human dermal fibroblasts (hDF) that were cultured with DBP in three-dimensional collagen sponges for 7 days developed a chondroblastic phenotype. Those cells produced metachromatic extracellular matrix that contained sulfated glycosaminoglycans [4], and they expressed RNA transcripts for the cartilage-specific genes aggrecan and type II collagen [6].

[0197] The purpose of the present study was to use this novel DBP/collagen sponge culture system to identify genes that are upregulated early in the process of chondroinduction of human dermal fibroblasts.

[0198] Representational difference analysis (RDA) is a subtractive hybridization method known in the art that uses PCR to amplify differentially expressed genes [7]. We used RDA to identify a pool of genes upregulated in hDFs cultured in DBP/collagen sponges. The analysis was performed at an early timepoint (3 days), prior to expression of the chondroblast phenotype. Upregulation of those genes was specific to cellular interactions with DBP because RDA subtracted those genes whose expression was increased due to cell attachment to the collagen matrix of control sponges. These experiments are described in detail below and in our manuscript (Yates et al., Experimental Cell Research, 2001, 265:203-211), the contents of which are expressly incorporated herein by reference.

[0199] Materials and Methods

[0200] Collagen sponges. 3-D collagen sponges were prepared from pepsin-digested bovine collagen [5]. Briefly, 250 μL of 0.5% collagen solution (Cellagen PC-5, ICN Biomedicals, Costa Mesa, Calif.) was neutralized with 1M HEPES (pH 7.4) and 1M NaHCO₃, poured into a mold, frozen, lyophilized, then irradiated with ultraviolet light. DBP was prepared from rat long bones [8]. Bilaminate DBP/collagen sponges were prepared by placing a spacer of moistened paper between two layers of collagen, and were packed with 3 mg of DBP between the layers of the sponge. Control sponges consisted of a single layer of collagen.

[0201] Cells and cell seeding. Human dermal fibroblasts were obtained from discarded tissue under an approved institutional protocol (Brigham and Women's Hospital #86-01858). Cells were isolated from neonatal foreskins by outgrowth culture, and were expanded in vitro to passage #12 prior to seeding onto DBP/collagen and control collagen sponges (10⁶ cells per sponge) [4]. The sponges were cultured for 3 to 21 days.

[0202] Histology. Sponges were fixed for 24 hours in 2% paraformaldehyde, 0.1 M cacodylate buffer (pH 7.4), were rinsed in 0.1 M cacodylate buffer, and were embedded in glycolmethacrylate (JB-4, Polysciences, Warrington, Pa.). Twenty micron-thick sections were cut and stained with 0.5% toluidine blue-O, pH 4.0 (Fisher Scientific, Pittsburgh, Pa.). The thick sections allowed visualization of metachromatic extracellular matrix above and below individual cells [4].

[0203] Demonstration of chondroinduction in vitro. Human dermal fibroblasts cultured in DBP/collagen and collagen sponges for 7 days were analyzed for metachromatic extracellular matrix by histology as above, and for synthesis of cartilage chondroitin 4-sulfate by ELISA [4].

[0204] RNA isolation. Total RNA was extracted from cultured sponges on day 3 for representational difference analysis, and on days 3, 7, 14, and 21 for Northern blot and RT-PCR. Sponges were homogenized in Trizol reagent (Life Technologies, Inc., Grand Island, N.Y.) according to the manufacturer's instructions [6]. RNA quality was evaluated by absorbance readings at 260 and 280 nm, and by ethidium bromide staining of RNA formaldehyde agarose gels.
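For illustration, the absorbance-based assessment mentioned above is commonly reduced to an A260/A280 ratio and a concentration estimate. The sketch below assumes the widely used conversion of about 40 μg/mL of RNA per A260 unit and an acceptance window of 1.8 to 2.2 for the ratio; the acceptance window, the function name, and the example readings are assumptions of this sketch and are not stated in the text.

```python
# Widely used conversion factor: one A260 unit corresponds to about 40 ug/mL RNA.
RNA_UG_PER_ML_PER_A260 = 40.0


def assess_rna(a260, a280, dilution_factor=1.0, min_ratio=1.8, max_ratio=2.2):
    """Summarize RNA purity and concentration from absorbance readings.

    min_ratio and max_ratio are assumed thresholds, not values from the text.
    """
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    ratio = a260 / a280
    concentration = a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor
    return {
        "ratio": ratio,
        "ug_per_ml": concentration,
        "acceptable": min_ratio <= ratio <= max_ratio,
    }


# Hypothetical readings for a sample measured at a 1:10 dilution.
print(assess_rna(0.5, 0.25, dilution_factor=10))
```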

[0205] Preparation of cDNA representations. Poly A⁺ mRNA was purified (Micro-Fast Track mRNA Isolation Kit, Invitrogen, Carlsbad, Calif.) from 100 μg of total RNA isolated from hDFs cultured in DBP/collagen and collagen sponges for 3 days (FIG. 2). The entire poly A⁺ mRNA preparation was reverse-transcribed into oligo dT-primed cDNA using Superscript II according to the manufacturer's instructions (Life Technologies, Inc.). Second strand synthesis was performed and the reactions were extracted with phenol/chloroform, were ethanol precipitated, and were resuspended in a total volume of 20 μl. cDNA synthesis was evaluated by gel electrophoresis of 2 μl of the reaction. The profiles of the two cDNAs (DBP/collagen and collagen sponges) were indistinguishable. Eight microliters of each cDNA was digested with Dpn II restriction enzyme (New England Biolabs, Inc., Beverly, Mass.). RBgl12/RBgl24 primers (RBgl12, 5′-GATCTGCGGTGA-3′ (SEQ ID NO: 68); RBgl24, 5′-AGCACTCTCCAGCCTCTCACCGCA-3′ (SEQ ID NO: 69)) [9] were annealed and ligated to the digested cDNAs (E. coli DNA ligase, Life Technologies, Inc.). Representations were generated by PCR amplification with RBgl24 primers. The representations were digested with Dpn II to remove RBgl24 primers then purified using the PCR Purification Kit (Qiagen, Chatsworth, Calif.). Representations were evaluated by gel electrophoresis and the profiles were similar for DBP/collagen and collagen representations.

[0206] Representational difference analysis. The specific conditions for RDA were essentially as described [9], except that mung bean nuclease treatment was omitted. All oligonucleotides were purchased from Life Technologies, Inc. Primer sequences were as follows [9]: JBgl12, 5′-GATCTGTTCATG-3′ (SEQ ID NO: 70); JBgl24, 5′-ACCGACGTCGACTATCCATGAACA-3′ (SEQ ID NO: 71); NBgl12, 5′-GATCTTCCCTCG-3′ (SEQ ID NO: 72); NBgl24, 5′-AGGCAACTGTGCTATCCGAGGGAA-3′ (SEQ ID NO: 73). Tester DNA was generated by ligating 0.5 μg of each cDNA representation to pre-annealed JBgl12/JBgl24 primers. A molar ratio of 1:100 (tester DNA:driver DNA) was used for the initial hybridization step (67° C. for 2 days). The hybridization reaction was diluted and used in PCR reactions with JBgl24 primers to amplify tester-tester DNA hybrids. The difference products (DP) were digested with Dpn II, purified, ligated to the next set of primers and then used as the tester DNA in the subsequent round. The ratios of tester:driver DNA and primers used for PCR in successive rounds were as follows: round 2, 1:400, NBgl12/NBgl24; round 3, 1:4000, JBgl12/JBgl24; round 4, 1:40,000, NBgl12/NBgl24.

[0207] Difference analyses were performed to identify genes that were differentially expressed in hDFs cultured in DBP/collagen sponges for 3 days (FIG. 1). A pool of Upregulated genes was identified by subtracting collagen driver DNA from DBP/collagen tester DNA. A pool of Downregulated genes was identified by subtracting DBP/collagen driver DNA from collagen tester DNA. Control difference analyses were performed with yeast tRNA to ensure that RDA enriched differentially expressed DNA sequences.

[0208] Successive iterations of hybridization/amplification produced a number of difference products with gel electrophoresis profiles that were unique to each combination of tester and driver. A loss of difference products was observed in the Upregulated analysis at the highest stringency (1:40,000). Thus, difference products from the third round were analyzed.

[0209] DNA dot blots. One microliter of each difference product was dot-blotted onto positively charged nylon membranes (Roche Molecular Biochemicals, USA). Non-radioactive DNA probes were generated from the pools of Upregulated and Downregulated DP using the DIG High Prime Kit (Roche Molecular Biochemicals) and were hybridized to dot blots according to the manufacturer's instructions. Chemiluminescent detection was performed with Blocking Buffer, anti-DIG antibody and CDP-Star according to the manufacturer's instructions (Roche Molecular Biochemicals).

[0210] Subcloning and sequencing of difference products. Upregulated DP3 was subcloned with the Topo Cloning Kit (Invitrogen). A total of 2300 transformants were grown in 96-well plates. Ninety-seven individual clones were randomly selected for analysis. Plasmid minipreps were prepared using the Wizard Plus SV Miniprep kit (Promega, Madison Wis.) and analyzed by Eco RI restriction enzyme digest (Promega) and DNA sequencing (Brigham and Women's Hospital Core DNA Sequencing Facility, Boston Mass.). Matches for DNA sequences were identified by searching the GenBank database [10], and novel sequences were compared to each other with BLAST 2 Sequences [11].

[0211] Northern hybridization. Total RNA isolated from hDFs cultured in collagen and DBP/collagen sponges was subjected to electrophoresis through 1% agarose gels (10 μg per lane) and was blotted onto a positively-charged nylon membrane (Roche Molecular Biochemicals). The membrane was hybridized overnight at 42° C. with rotation to purified, [³²P]-labeled DNA probes in hybridization buffer containing 50% formamide, 5×SSC, 1% SDS, 5× Denhardt's solution, and 100 μg/ml denatured herring sperm DNA. The membrane was washed (2×SSC, 0.1% SDS, 25° C. for 5 minutes, twice; 0.2×SSC, 0.1% SDS, 25° C. for 5 minutes, twice; 0.2×SSC, 0.1% SDS, 42° C. for 15 minutes, twice) prior to autoradiography. The X-ray films were scanned with an Epson 1200s Scanner with a transparency adapter and the images were analyzed with Scion Image software (Scion Corporation, Frederick, Md.). The vigilin probe was an RDA-identified fragment that contains a portion of the carboxy-terminal protein coding sequence. Vigilin gene expression levels were normalized to total RNA (18S rRNA oligonucleotide, Ambion, Inc., Austin Tex.).

[0212] RT-PCR. Total RNA from hDFs cultured in DBP/collagen and control collagen sponges was diluted to 100 ng/ml and treated with DNase I (Roche Molecular Biochemicals, USA) to eliminate any contaminating genomic DNA. Two μg of DNase-treated RNA were used in random hexamer-primed cDNA synthesis according to the manufacturer's instructions (Superscript II, Life Technologies, Inc.). PCR primers specific for difference product DNA sequences were designed using the Primer3 program [12]. Primer sequences were as follows: COL11A1, 5′-GCTGCTCAAGCTCAGAAACC-3′ (SEQ ID NO: 74), 5′-CCCTGCCGTCTATTTCTTTG-3′ (SEQ ID NO: 75); α-11 integrin, 5′-TAGTAGCTGGGGCAGCAAA-3′ (SEQ ID NO: 76), 5′-TGGAAGCTCGGCTTCTTTAG-3′ (SEQ ID NO: 77); FGF2, 5′-ACAAAAGCCTTGAGGATTGC-3′ (SEQ ID NO: 78), 5′-AAAACTGCCGTTGGCATTAG-3′ (SEQ ID NO: 79). PCR primers specific for the cartilage matrix gene aggrecan [6] and the housekeeping gene glyceraldehyde-3-phosphate dehydrogenase (G3PDH) [13] were as described. The cycling conditions for each primer pair were determined in PCR reactions that used the corresponding RDA product as a template. Cycling conditions were as follows: COL11A1: 94° C. for 5 min; 94° C. for 45 sec, 55° C. for 45 sec, 72° C. for 2 min (35 cycles); 2 min at 72° C. α-11 integrin and FGF2: 94° C. for 5 min; 94° C. for 1 min, 55° C. for 2 min, 72° C. for 3 min (40 cycles); 10 min at 72° C. Aggrecan and G3PDH: 94° C. for 5 min; 94° C. for 45 sec, 60° C. for 45 sec, 72° C. for 2 min (35 cycles); 72° C. for 2 min. The primers were used in PCR reactions with cDNA from hDFs cultured in DBP/collagen sponges for 3 days, and the resulting PCR products were subcloned and sequenced to ensure that the desired gene had been amplified.
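
By way of illustration only, the three cycling programs given above can be written as structured data, which makes it easy to compare them or to estimate run length. The Python sketch below is an assumption-laden example rather than part of the described method: the class and field names are hypothetical, and the computed time counts programmed hold steps only, ignoring thermal ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    """One thermal cycling program; names are illustrative, values are from the text."""
    initial_denature_s: int   # 94 C hold before cycling
    denature_s: int           # 94 C step within each cycle
    anneal_c: int             # annealing temperature in degrees C
    anneal_s: int
    extend_s: int             # 72 C step within each cycle
    cycles: int
    final_extend_s: int       # 72 C hold after cycling

    def hold_time_s(self) -> int:
        """Total programmed hold time in seconds (ramp times are not included)."""
        per_cycle = self.denature_s + self.anneal_s + self.extend_s
        return self.initial_denature_s + self.cycles * per_cycle + self.final_extend_s


PROGRAMS = {
    "COL11A1": PcrProgram(300, 45, 55, 45, 120, 35, 120),
    "alpha-11 integrin / FGF2": PcrProgram(300, 60, 55, 120, 180, 40, 600),
    "aggrecan / G3PDH": PcrProgram(300, 45, 60, 45, 120, 35, 120),
}

for name, program in PROGRAMS.items():
    print(f"{name}: {program.hold_time_s() / 60:.1f} min of programmed holds")
```

Under these assumptions, the COL11A1 and aggrecan/G3PDH programs each total 129.5 minutes of hold time and the α-11 integrin/FGF2 program totals 255 minutes.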

[0213] For kinetic gene expression analysis by RT-PCR, 1 μl of cDNA (the equivalent of 50 ng total RNA) was used in each PCR reaction. Eight μl of each PCR reaction was subjected to electrophoresis on 2% agarose gels. Photographs of ethidium bromide-stained gels were scanned with an Epson 1200s Scanner and the images were analyzed with Scion Image software. Gene expression levels were normalized to G3PDH.
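
By way of illustration only, normalization to a housekeeping gene can be sketched as follows. The Python example is a generic calculation and is not taken from the image-analysis software named above; the function name is hypothetical and the band intensities are invented values chosen to make the arithmetic easy to follow.

```python
def normalized_fold_change(target_test: float, reference_test: float,
                           target_control: float, reference_control: float) -> float:
    """Fold change of a target band after dividing each target intensity
    by the housekeeping (reference) band intensity from the same sample."""
    if min(reference_test, reference_control, target_control) <= 0:
        raise ValueError("reference and control intensities must be positive")
    test_ratio = target_test / reference_test
    control_ratio = target_control / reference_control
    return test_ratio / control_ratio


# Hypothetical densitometry readings (arbitrary units), not data from this study.
fold = normalized_fold_change(
    target_test=900.0, reference_test=600.0,        # e.g. DBP/collagen sponge
    target_control=300.0, reference_control=600.0,  # e.g. control collagen sponge
)
print(f"{fold:.1f}-fold")
```

Dividing by the reference band first corrects for differences in the amount of cDNA loaded, so that the fold change reflects the target gene rather than the sample input.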

[0214] Results

[0215] This analysis was designed to identify a pool of genes upregulated early in hDFs exposed to DBP in collagen sponges, prior to the expression of cartilage extracellular matrix. Histologic evaluation of human dermal fibroblasts cultured in control collagen sponges for 3 days revealed that cells were distributed throughout the lattice and were attached along and across collagen fibers. In the DBP/collagen sponges, many hDFs were attached to the collagen lattice at 3 days; those cells that had migrated into the packet of DBP were attached to and between the particles of DBP. After 3 days, no metachromatic extracellular matrix was observed in either the control collagen or the DBP/collagen sponges. Metachromatic matrix was visible, however, in DBP/collagen sponges after 7 days. In addition, biochemical analysis of chondroitin 4-sulfate content showed 20% more in DBP/collagen sponges (265+/−19 ng/sponge) than control collagen sponges (222+/−24 ng/sponge) after 7 days in culture (n=6, p<0.01). Three days was therefore taken to represent a time point at which early interactions were occurring between the cells and DBP and was chosen for analysis of differentially expressed genes.
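
By way of illustration only, the relative difference in chondroitin 4-sulfate content can be recomputed from the two group means reported above. The short Python sketch below uses only those two means; the function name is hypothetical, and no statistical test is reproduced because the individual measurements are not given.

```python
def percent_increase(treated_mean: float, control_mean: float) -> float:
    """Relative increase of the treated mean over the control mean, in percent."""
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return (treated_mean - control_mean) / control_mean * 100.0


# Means in ng/sponge, as reported for DBP/collagen and control collagen sponges.
increase = percent_increase(265.0, 222.0)
print(f"{increase:.1f}% more chondroitin 4-sulfate")
```

The result, about 19.4%, is consistent with the rounded figure of 20% stated in the text.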

[0216] Representational difference analysis (RDA) is a PCR-based method of subtractive hybridization in which differentially expressed cDNAs are amplified [Hubank and Schatz 1999]. We used RDA to identify pools of Upregulated and Downregulated genes in hDFs cultured in DBP/collagen sponges for 3 days. The uniqueness of the DNA sequences present in each pool was confirmed by dot blot. Upregulated difference products (DP) did not hybridize with the Downregulated DP, but did hybridize with self. Similarly, the Downregulated DP did not hybridize with the Upregulated DP, but hybridized with self. Control analyses should contain essentially all amplifiable sequences within the original collagen or DBP/collagen representation. As expected, difference products from the Upregulated and Downregulated analyses hybridized with difference products from the corresponding Control analysis. That the Upregulated DP also hybridized with the Downregulated Control DP and vice versa indicates that at least some of the genes identified as Upregulated were initially present in both representations (as opposed to de novo transcription).

[0217] Of 97 Upregulated clones that were randomly selected for analysis, 6 clones did not contain insert, 14 clones contained 11 novel sequences (SEQ ID NOs:1-11), and 77 clones matched DNA sequences deposited in GenBank. Sixty of the latter clones corresponded to 49 mRNAs (Table 1). The additional 17 clones corresponded to 6 GenBank sequences with unknown gene product function (Table 2).
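
By way of illustration only, the clone counts reported above can be tallied to confirm that the categories account for every clone analyzed. The Python sketch below uses only the numbers stated in the preceding paragraph; the dictionary keys are descriptive labels chosen for this example.

```python
# Clone counts from the paragraph above; key names are illustrative labels.
clone_classes = {
    "no insert": 6,
    "novel sequence": 14,
    "GenBank match": 77,
}
genbank_matches = {
    "known mRNA (Table 1)": 60,
    "unknown gene product function (Table 2)": 17,
}

total_clones = sum(clone_classes.values())
print(f"clones analyzed: {total_clones}")

for label, count in clone_classes.items():
    print(f"{label}: {count} ({count / total_clones:.0%})")

# The two GenBank sub-classes should add up to the GenBank-matched clones.
print(sum(genbank_matches.values()) == clone_classes["GenBank match"])
```

The categories sum to the 97 clones analyzed, and the 60 and 17 GenBank-matched clones sum to 77.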

[0218] The kinetics of vigilin expression in hDFs cultured in DBP/collagen and collagen sponges were discerned by Northern hybridization. Vigilin was selected from the Upregulated genes because its expression had been reported to decrease with time in cultured primary fibroblasts [14]. Three different-sized messages were detected. The 4.5- and 6.0-kb transcripts matched the sizes of mRNAs previously reported in human tissue [15]. An approximately 8.0-kb transcript was also detected, which likely represents an alternatively spliced message [14-16]. The total increase in vigilin transcript (relative to monolayer culture) was 5.6-fold. The majority of this increase (4.7-fold) was due to upregulation of the 8.0-kb transcript. The 8.0-kb vigilin transcript was also elevated 2.0-fold after 7 days in DBP/collagen sponges. In contrast, vigilin RNA levels in the control collagen sponge did not exceed 2.0-fold over monolayer culture and the levels of the individual transcripts remained relatively constant.

[0219] The above gene expression analysis shows that vigilin is transiently upregulated in hDFs cultured in DBP/collagen sponges. To compare this transient pattern to gene expression during cartilage matrix production (chondrogenesis), we analyzed sponge samples for expression of cartilage signature genes identified by RDA (Table 1). Several of the genes identified as Upregulated have been previously described in the context of chondrocytes or cartilage tissues. RT-PCR was used to characterize the kinetics of their expression in hDFs cultured in DBP/collagen and collagen sponges. After 3 days in culture, there was 3.0-fold more type XI collagen (COL11A1) mRNA in DBP/collagen sponges than control collagen sponges (FIG. 3). Expression was maximal on days 7 and 14, and declined thereafter. Similarly, on day 3, there was 2.8-fold more α-11 integrin mRNA in DBP/collagen sponges than in control collagen sponges (FIG. 3). Expression of that gene was maximal on day 14. FGF2 expression was maximal on day 14 (FIG. 3).

[0220] These kinetic analyses show two different patterns of expression for genes that were identified by RDA as Upregulated in hDFs cultured in DBP/collagen sponges. These patterns, transient or intermediate, are distinct from the expression of cartilage extracellular matrix genes. As an example of an abundant cartilage matrix gene, expression of aggrecan mRNA was analyzed. Aggrecan was not identified as Upregulated in hDFs cultured in DBP/collagen sponges for 3 days. As expected, an increase in expression of aggrecan mRNA in DBP/collagen sponges was observed after 7 days and continued to increase at later timepoints (FIG. 3).

[0221] Discussion

[0222] Treatment options for damaged articular cartilage are limited because of that tissue's poor capacity for repair. Possible approaches to this problem are to stimulate cartilage matrix production in situ or to engineer replacement tissue. Both of these approaches would benefit from a clearer understanding of the molecular mechanisms of chondroblast differentiation. Demineralized bone induces endochondral bone formation in vivo [17], is available through regional bone banks, and is used in humans for orthopedic [18], oral and maxillofacial [19], and hand problems [20]. As an endochondral process, DBP-induced cartilage becomes calcified and replaced with bone, but the cartilage phase can be prolonged by hypocalcemia and anti-angiogenic factors [21]. An in vitro analysis of early cellular effects of interaction with demineralized bone may reveal information regarding the mechanisms of induced chondrogenesis in post-natal mesenchymal cells.

[0223] Representational difference analysis was used to identify a pool of genes that were upregulated during chondroinduction of human dermal fibroblasts in a DBP/collagen sponge culture system. The upregulation of genes was specific to cellular interactions with DBP because RDA subtracted those genes whose expression was increased due to cell attachment to the collagen matrix of control sponges. Those genes represented several functional classes, including protein synthesis and trafficking, transcriptional regulation, and extracellular and cytoskeletal elements.

[0224] The expression pattern of several genes known to be expressed in, or to have an effect on, cartilage tissues was characterized in hDFs cultured in DBP/collagen sponges. Vigilin [14, 15] and type XI collagen [22] are expressed in articular cartilage. α-11 integrin expression has been observed in chondrosarcoma [23]. FGF2 has multiple actions on chondrocytes [rev. in 24]. A DBP-induced increase in expression of these genes was confirmed by Northern blot and RT-PCR. Kinetic analysis of gene expression showed two patterns of expression (transient and intermediate) for genes that were identified as Upregulated on day 3. In contrast, kinetic analysis of an abundant cartilage matrix gene, aggrecan, showed a later increase in gene expression.

[0225] Upregulation of several genes is consistent with an increase in protein synthesis and export, as would be expected in cells undergoing chondroblastic differentiation. Tryptophanyl tRNA synthetase catalyzes the attachment of tryptophan to its tRNA [25]. Exportin-t [26] and vigilin [27] have been implicated in tRNA export from the nucleus. Sec23 [28] is present in a multiprotein complex that functions in selective transport of proteins from the transitional endoplasmic reticulum to the cis Golgi [29].

[0226] Two of these gene products have documented roles in transcriptional regulation. TRAX and translin are part of a nuclear complex that binds the Egr response element in a strand-specific manner [30]. TRAX contains a nuclear localization signal that probably functions to transport TRAX and its binding partner, translin (which lacks a nuclear localization signal), to the nucleus [31]. Chromodomain helicase DNA binding protein 4 (CHD4, also known as Mi-2β) [32] is present in protein complexes that activate or repress transcription via an ATP-dependent mechanism or histone deacetylase activity, respectively [33-35]. Upregulation of TRAX and CHD4 implies that changes in chromatin structure occur to permit silencing of some genes (fibroblast-specific) and expression of others (chondroblast-specific).

[0227] Others of the Upregulated gene products are known to stabilize mRNA associations with the cytoskeleton, which is important for the establishment of cell polarity [36]. Vigilin has been shown to bind 3′ untranslated regions in the vitellogenin mRNA, which results in stabilization of the message [37]. TRAX, in association with translin and an ATPase in the transitional endoplasmic reticulum, binds cytosolic γ-actin and is thought to function in mRNA stabilization on the cytoskeleton [38]. Fibroblasts and chondrocytes are strikingly different in shape both in vivo and in vitro, the former being spindle-shaped, and the latter, round.

[0228] A number of upregulated genes encode proteins that are cytoskeletal components. β1 integrin interacts via its cytoplasmic tail with the carboxy-terminal end of ABP280 [39]. This protein, in turn, binds actin via its amino-terminus [40]. Integrin α11 [23] also associates with β1 integrin [41]. The RING-finger protein, MID1, interacts with microtubules [42].

[0229] The increase in distinctive cytoskeletal elements upon interaction with DBP may reflect specific shape changes induced by attachment to DBP. Because a number of those proteins have been implicated in mechanotransduction, it is also possible that the shifts are related to the chondroblast phenotype. ABP280 redistributed to the surface of lamellipodia of lymphocytes after adherence to a collagen matrix [43]. In human gingival fibroblasts, calcium-dependent assembly of actin filaments and ABP280 recruitment (and its subsequent serine phosphorylation) was induced at the site of force application [44]. Moreover, activity of stretch-activated calcium channels was decreased upon cytoskeletal reorganization, suggesting a mechanism for mechanoprotection of the cell membrane [44]. Mechanical tension on the cytoskeleton (via β1-integrin binding to extracellular matrix) has also been linked to localized protein synthesis [45].

[0230] Finally, a number of extracellular matrix proteins were identified. Type XI collagen [46] forms cross-links with type II collagen fibrils in cartilage [22] and is essential for skeletal development [47]. Another fibrillar collagen, type III, is essential for successful formation of type I collagen fibrils during development [48]. Type VI collagen is expressed in a variety of tissues, including cartilage [49, 50].

[0231] Taken together, the profile of upregulated genes represents a variety of cellular functions. The significance of these changes in gene expression is that DBP appears to elicit a programmatic shift in cell physiology of the target cells related to chondroinduction.

TABLE 1
Genes upregulated in human dermal fibroblasts cultured in three-dimensional collagen sponges with demineralized bone powder for 3 days.

Category/Subcategory
        Gene (GenBank Locus)

Extracellular matrix
        COL3A1 (NM_000090) (SEQ ID NO: 13)
        COL6A3 (NM_004369.1) (SEQ ID NO: 14)
        COL11A1 (NM_001854) (SEQ ID NO: 15)

Cytoskeleton
    Actin-associated
        Actin-binding protein 280 (NM_001456.1) (SEQ ID NO: 16)
        RhoGAP1 (NM_004815) (SEQ ID NO: 17)
    Microtubule-associated
        MID1 (AF041210) (SEQ ID NO: 18)
    Cell adhesion
        β1 integrin (NM_002211) (SEQ ID NO: 19)
        α11 integrin (AF109681) (SEQ ID NO: 20)
        erythroblast macrophage protein (AF084928) (SEQ ID NO: 21)
        Vigilin (NM_005336) (SEQ ID NO: 22)
        Translin-associated factor X (HSTRAXGEN) (SEQ ID NO: 23)

Protein synthesis
        rRNA synthesis: RNA polymerase I, largest subunit (HSU33460) (SEQ ID NO: 24)
        tRNA aminoacylation: tryptophanyl tRNA synthetase 2 (NM_015836) (SEQ ID NO: 25)
        tRNA export: Exportin-t (AF039022) (SEQ ID NO: 26); Vigilin (SEQ ID NO: 22)
        Protein trafficking: Sec23 homolog A (NM_006364.1) (SEQ ID NO: 27)

Transcription
        Translin-associated factor X (SEQ ID NO: 23)
        Chromodomain helicase DNA binding protein 4 (NM_001273.1) (SEQ ID NO: 28)
        Nucleosome assembly protein (HUMNAP) (SEQ ID NO: 29)
        ID-2H Homo sapiens (HUMID2HC) (SEQ ID NO: 30)

Growth factors
        Fibroblast growth factor 2 (NM_002006) (SEQ ID NO: 31)
        Insulin-like growth factor binding protein-3 (HSIGFBP3M) (SEQ ID NO: 32)
        Wnt-5a Homo sapiens (HUMWNT5A) (SEQ ID NO: 33)

Other
        Golgin A4 (NM_002078) (SEQ ID NO: 34)
        Multidrug resistance-associated protein (HUMMRPX) (SEQ ID NO: 35)
        ATP-specific succinyl-CoA synthetase beta subunit (AF058953) (SEQ ID NO: 36)
        Aspartyl beta-hydroxylase (AF289489) (SEQ ID NO: 37)
        Ras-related GTP binding protein (AF106681) (SEQ ID NO: 38)
        RNF11 (AB024703) (SEQ ID NO: 39)
        Lysyl oxidase-like protein 2 (AF117949) (SEQ ID NO: 40)
        C2orf2^(ropp120) (AF177377) (SEQ ID NO: 41)
        Sec61 homolog (AF077032) (SEQ ID NO: 42)
        LYST-interacting protein LIP6 (AF141342) (SEQ ID NO: 43)
        Breast carcinoma amplified sequence 2 (NM_005872) (SEQ ID NO: 44)
        Hepatocellular carcinoma novel gene-3 protein (AF251079) (SEQ ID NO: 45)
        KIAA0908 (AB020715) (SEQ ID NO: 46)
        KIAA0294 (AB002292) (SEQ ID NO: 47)
        KIAA0184 (D80006) (SEQ ID NO: 48)
        cDNA DKFZp58611418 (AL049378) (SEQ ID NO: 49)
        cDNA FLJ10704 fis, clone NT2RP3000841 (AK001566) (SEQ ID NO: 50)
        cDNA FLJ10051 fis, clone HEMBA1001281 (AK000913) (SEQ ID NO: 51)
        cDNA FLJ12487 fis, clone NT2RM2000609 (AK022549) (SEQ ID NO: 52)
        cDNA FLJ23177 fis, clone LNG10649 (AK026830) (SEQ ID NO: 53)
        Decorin (XM_012239) (SEQ ID NO: 60)
        Lysyl Oxidase (XM_003695) (SEQ ID NO: 61)
        Lysyl hydroxylase 2 (XM_002844) (SEQ ID NO: 62)
        Prolyl 4-hydroxylase (XM_005728.2) (SEQ ID NO: 63)
        F-box only protein 32 (NM_058229) (SEQ ID NO: 64)
        Fibronectin receptor, alpha polypeptide (ITGA5) (NM_002205) (SEQ ID NO: 65)
        Ras-related GTPase (XM_003032) (SEQ ID NO: 66)
        Aminophospholipid-transporting ATPase (ATP10C) (AY029489) (SEQ ID NO: 67)

[0232] TABLE 2
GenBank sequences upregulated by cellular interactions with demineralized bone powder (DBP)

Sequence ID (GenBank Locus)        Corresponding Bases
BAC GS1-99H8 (AC004010)            110,957-112,715 (SEQ ID NO: 54)
BAC RP11-394J1 (AC008149)          8,825-9,566 (SEQ ID NO: 55)
clone 117O3 (HS117O3)              119,269-119,527 (SEQ ID NO: 56)
clone RP1-191N21 (HS191N21)        96,784-97,438 (SEQ ID NO: 57)
clone RP4-562A11 (AC006451)        65,010-65,577 (SEQ ID NO: 58)
clone RP11-436D10 (AL133417)       124,873-125,466 (SEQ ID NO: 59)

REFERENCES

[0233] 1. Bi, W., Deng, J. M., Zhang, Z., Behringer, R. R., and de Crombrugghe, B. (1999). Sox9 is required for cartilage formation. Nature Genet. 22, 85-89.

[0234] 2. Storm, E. E., and Kingsley, D. M. (1999). GDF5 coordinates bone and joint formation during digit development. Dev. Biol. 209, 11-27, doi: dbio.1999.9241.

[0235] 3. Brunet, L. J., McMahon, J. A., McMahon, A. P., and Harland, R. M. (1998). Noggin, cartilage morphogenesis, and joint formation in the mammalian skeleton. Science 280, 1455-1457.

[0236] 4. Mizuno, S., and Glowacki, J. (1996). Chondroinduction of human dermal fibroblasts by demineralized bone in three-dimensional culture. Exp. Cell Res. 227, 89-97, doi: excr.1996.0253.

[0237] 5. Mizuno, S., and Glowacki, J. (1996). Three-dimensional composite of demineralized bone powder and collagen for in vitro analysis of chondroinduction of human dermal fibroblasts. Biomaterials 17, 1819-1825.

[0238] 6. Glowacki, J., Yates, K., Little, G., and Mizuno, S. (1998). Induced chondroblastic differentiation of human dermal fibroblasts by three-dimensional culture with demineralized bone matrix. Mat. Sci. Eng. C6, 199-203.

[0239] 7. Hubank, M., and Schatz, D. G. (1999). cDNA representational difference analysis: a sensitive and flexible method for identification of differentially expressed genes. Meth. Enzymol. 303, 325-348.

[0240] 8. Glowacki, J. (1996). Cellular reactions to bone-derived material. Clin. Orthop. Rel. Res. 324, 47-54.

[0241] 9. Braun, B. S., Frieden, R., Lessnick, S. L., May, W. A., and Denny, C. T. (1995). Identification of target genes for the Ewing's sarcoma EWS/FLI fusion protein by representational difference analysis. Molecular Cell. Biol. 15, 4623-4630.

[0242] 10. Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.

[0243] 11. Tatusova, T. A., and Madden, T. L. (1999). BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol. Lett. 174, 247-250.

[0244] 12. Rozen, S., and Skaletsky, H. J. (1998). Primer3. Code available at http://www.genome.wi.mit.edu/genome_software/other/primer3.html

[0245] 13. Fuller, J. F., McAdara, J., Yaron, Y., Sakaguchi, M., Fraser, J. K., and Gasson, J. C. (1999). Characterization of HOX gene expression during myelopoiesis: role of HOX A5 in lineage commitment and maturation. Blood 93, 3391-3400.

[0246] 14. Neu-Yilik, G., Zorbas, H., Gloe, T. R., Raabe, H. M., Hopp-Christensen, T. A., and Muller, P. K. (1993). Vigilin is a cytoplasmic protein. A study on its expression in primary cells and in established cell lines of different species. Eur. J. Biochem. 213, 727-736.

[0247] 15. Plenz, G., Kugler, S., Schnittger, S., Rieder, H., Fonatsch, C., and Muller, P. K. (1994). The human vigilin gene: identification, chromosomal localization and expression pattern. Hum. Genet. 93, 575-582.

[0248] 16. Kugler, S., Plenz, G., and Muller, P. K. (1996). Two additional 5′ exons in the human Vigilin gene distinguish it from the chicken gene and provide the structural basis for differential routes of gene expression. Eur. J. Biochem. 238, 410-417.

[0249] 17. Reddi, A. H., and Huggins, C. B. (1972). Biochemical sequences in the transformation of normal fibroblasts in adolescent rats. Proc. Natl. Acad. Sci. 69, 1601-05.

[0250] 18. Urist, M. R., and Dawson, E. (1981). Intertransverse process fusion with the aid of chemosterilized autolysed allogeneic bone. Clin. Orthop. Rel. Res. 154, 97-113.

[0251] 19. Glowacki, J., Kaban, L. B., Murray, J. E., Folkman, J., and Mulliken, J. B. (1981). Application of the biological principle of induced osteogenesis for craniofacial defects. Lancet 1, 959-963.

[0252] 20. Upton, J., and Glowacki, J. (1992). Hand reconstruction with allograft demineralized bone: Twenty-six implants in twelve patients. J. Hand Surg. 17A, 704-713.

[0253] 21. Glowacki, J. (1986). Cartilage and bone repair: experimental and clinical studies. Arthroscopy 2, 169-173.

[0254] 22. Mendler, M., Eich-Bender, S. G., Vaughan, L., Winterhalter, K. H., and Bruckner, P. (1989). Cartilage contains mixed fibrils of collagen types II, IX, and XI. J. Cell Biol. 108, 191-197.

[0255] 23. Lehnert, K., Ni, J., Leung, E., Gough, S. M., Weaver, A., Yao, W.-P., Liu, D., Wang, S. X., Morris, C. M., and Krissansen, G. W. (1999). Cloning, sequence analysis, and chromosomal localization of the novel human integrin α11 subunit (ITGA11). Genomics 60, 179-187, doi: geno.1999.5909.

[0256] 24. Trippel, S. B. (1995). Growth factor actions on articular cartilage. J. Rheumatol. Suppl. 43, 129-132.

[0257] 25. Jorgensen, R., Sogaard, T. M. M., Rossing, A. B., Martensen, P. M., and Justesen, J. (2000). Identification and characterization of human mitochondrial tryptophanyl-tRNA synthetase. J. Biol. Chem. 275, 16820-16826.

[0258] 26. Kutay, U., Lipowsky, G., Izaurralde, E., Bischoff, F. R., Schwarzmaier, P., Hartmann, E., and Gorlich, D. (1998). Identification of tRNA-specific nuclear export receptor. Molecular Cell 1, 359-369.

[0259] 27. Kruse, C., Willkomm, D. K., Grunweller, A., Vollbrandt, T., Sommer, S., Busch, S., Pfeiffer, T., Brinkmann, J., Hartmann, R. K., and Muller, P. K. (2000). Export and transport of tRNA are coupled to a multi-protein complex. Biochem. J. 346, 107-115.

[0260] 28. Paccaud, J.-P., Reith, W., Carpentier, J.-L., Ravazzola, M., Amherdt, M., Schekman, R., and Orci, L. (1996). Cloning and functional characterization of mammalian homologues of the COPII component Sec23. Molecular Biol. Cell 7, 1535-1546.

[0261] 29. Kuehn, M. J., Herrmann, J. M., and Schekman, R. (1998). COPII-cargo interactions direct protein sorting into ER-derived transport vesicles. Nature 391, 187-190.

[0262] 30. Taira, E., Finkenstadt, P. M., and Baraban, J. M. (1998). Identification of Translin and Trax as components of GS1 strand-specific DNA binding complex enriched in brain. J. Neurochem. 71, 471-477.

[0263] 31. Aoki, K., Ishida, R., and Kasai, M. (1997). Isolation and characterization of a cDNA encoding a Translin-like protein, TRAX. FEBS Lett. 401, 109-112.

[0264] 32. Woodage, T., Basrai, M. A., Baxevanis, A. D., Hieter, P., and Collins, F. S. (1997). Characterization of the CHD family of proteins. Proc. Natl. Acad. Sci. U.S.A. 94, 11472-11477.

[0265] 33. Xue, Y., Wong, J., Moreno, G. T., Young, M. K., Cote, J., and Wang, W. (1998). NURD, a novel complex with both ATP-dependent chromatin-remodeling and histone deacetylase activities. Molecular Cell 2, 851-861.

[0266] 34. Zhang, Y., LeRoy, G., Seelig, H.-P., Lane, W. S., and Reinberg, D. (1998). The dermatomyositis-specific autoantigen Mi2 is a component of a complex containing histone deacetylase and nucleosome remodeling activities. Cell 95, 279-289.

[0267] 35. Tong, J. K., Hassig, C. A., Schnitzler, G. R., Kingston, R. E., and Schreiber, S. L. (1998). Chromatin deacetylation by an ATP-dependent nucleosome remodeling complex. Nature 395, 917-921.

[0268] 36. Oleynikov, Y., and Singer, R. H. (1998). RNA localization: different zipcodes, same postman? Trends Cell. Biol. 8, 381-383.

[0269] 37. Dodson, R. E., and Shapiro, D. J. (1997). Vigilin, a ubiquitous protein with 14 K homology domains, is the estrogen-inducible vitellogenin mRNA 3′-untranslated region-binding protein. J. Biol. Chem. 272, 12249-12252.

[0270] 38. Wu, X.-Q., Lefrancois, S., Morales, C. R., and Hecht, N. B. (1999). Protein-protein interactions between the testis brain RNA-binding protein and the transitional endoplasmic reticulum ATPase, a cytoskeletal γ Actin and Trax in male germ cells and the brain. Biochemistry 38, 11261-11270.

[0271] 39. Loo, D. T., Kanner, S. B., and Aruffo, A. (1998). Filamin binds to the cytoplasmic domain of the β₁-Integrin. J. Biol. Chem. 273, 23304-23312.

[0272] 40. Hock, R. S., Davis, G., and Speicher, D. W. (1990). Purification of human smooth muscle filamin and characterization of structural domains and functional sites. Biochemistry 29, 9441-9451.

[0273] 41. Velling, T., Kusche-Gullberg, M., Sejersen, T., and Gullberg, D. (1999). cDNA cloning and chromosomal localization of human alpha(11) integrin. A collagen-binding, I domain-containing, beta(1)-associated integrin alpha-chain present in muscle tissues. J. Biol. Chem. 274, 25735-25742.

[0274] 42. Schweiger, S., Foerster, J., Lehmann, T., Suckow, V., Muller, Y. A., Walter, G., Davies, T., Porter, H., van Bokhoven, H., Lunt, P. W., Traub, P., and Ropers, H.-H. (1999). The Opitz syndrome gene product, MID1, associates with microtubules. Proc. Natl. Acad. Sci. U.S.A. 96, 2794-2799.

[0275] 43. Schwarzman, A. L., Singh, N., Tsiper, M., Gregori, L., Dranovsky, A., Vitek, M. P., Glabe, C. G., St George-Hyslop, P. H., and Goldgaber, D. (1999). Endogenous presenilin 1 redistributes to the surface of lamellipodia upon adhesion of Jurkat cells to a collagen matrix. Proc. Natl. Acad. Sci. U.S.A. 96, 7932-7937.

[0276] 44. Glogauer, M., Arora, P., Chou, D., Janmey, P. A., Downey, G. P., and McCulloch, C. A. (1998). The role of actin-binding protein 280 in integrin-dependent mechanoprotection. J. Biol. Chem. 273, 1689-1698.

[0277] 45. Chicurel, M. E., Singer, R. H., Meyer, C. J., and Ingber, D. E. (1998). Integrin binding and mechanical tension induce movement of mRNA and ribosomes to focal adhesions. Nature 392, 730-733.

[0278] 46. Bernard, M., Yoshioka, H., Rodriguez, E., Van der Rest, M., Kimura, T., Ninomiya, Y., Olsen, B. R., and Ramirez, F. (1988). Cloning and sequencing of pro-alpha 1 (XI) collagen cDNA demonstrates that type XI belongs to the fibrillar class of collagens and reveals that the expression of the gene is not restricted to cartilagenous tissue. J. Biol. Chem. 263, 17159-17166.

[0279] 47. Li, Y., Lacerda, D. A., Warman, M. L., Beier, D. R., Yoshioka, H., Ninomiya, Y., Oxford, J. T., Morris, N. P., Andrikopoulos, K., Ramirez, F., et al. (1995). A fibrillar collagen gene, Col11a1, is essential for skeletal morphogenesis. Cell 80, 423-430.

[0280] 48. Liu, X., Wu, H., Byrne, M., Krane, S., and Jaenisch, R. (1997). Type III collagen is crucial for collagen I fibrillogenesis and for normal cardiovascular development. Proc. Natl. Acad. Sci. U.S.A. 94, 1852-1856.

[0281] 49. Sherwin, A. F., Carter, D. H., Poole, C. A., Hoyland, J. A., and Ayad, S. (1999). The distribution of type VI collagen in the developing tissues of the bovine femoral head. Histochem. J. 31, 623-632.

[0282] 50. Pullig, O., Weseloh, G., and Swoboda, B. (1999). Expression of type VI collagen in normal and osteoarthritic human cartilage. Osteoarthritis Cartilage 7, 191-202.

DETAILED DESCRIPTION OF THE DRAWINGS

[0283] FIG. 2. Schematic of experimental design for representational difference analysis. Human dermal fibroblasts (hDF) are seeded onto DBP/collagen and control collagen sponges. After 3 days in culture, RNA is isolated and is used to generate cDNA representations of the genes expressed at that timepoint. Ligation of short oligonucleotide primers (JBgl) to the representations creates tester DNA. No primers are added to the representations that are used as the driver DNA. Hybridizations are performed with the 4 combinations of tester and driver DNA shown. Those sequences that are present in the tester in excess are amplified by PCR with JBgl primers. Control analyses use yeast tRNA as driver so that all DNA sequences in each tester are amplified. JBgl primers are removed from the first-round difference products (DP1). A new set of primers (NBgl) is ligated and the DNA is used as tester in the next cycle of hybridization/amplification (Round 2). Differentially expressed DNAs are enriched in subsequent rounds of hybridization and amplification.

[0284] FIG. 3. Kinetic analyses of cartilage signature genes. Gene expression levels were analyzed by RT-PCR and normalized to G3PDH. The cartilage signature genes type XI collagen (COL11A1), α-11 integrin, and FGF2 were revealed by RDA. Aggrecan was used as an example of an abundant cartilage extracellular matrix gene.

[0285] Equivalents

[0286] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

[0287] All references disclosed herein are incorporated by reference in their entirety.

[0288] What is claimed is presented below and is followed by a Sequence Listing.

1 79 1 578 DNA Homo sapiens 1 tatgttcaca aaggaagtac taaagagcgccatggatatt gcaccctggg ggaagctctc 60 aacaaactgg acttctcaac tgccattctggattccagaa gatttaacta cgtggtccgg 120 ctgttggagc tgatagcaaa gtcacagctcacatccctga gtggcatcgc ccaaaagaac 180 ttcatgaata ttttggaaaa agtggtactgaaaggttagc cttctctcac tctctcgccc 240 ctttttatag gcatggaggt gggcagatggattttccaat gaagtggacg tgtcattaga 300 ctttaagaca tgtgaatgga tggaaatgaataactccagc tacattttag agacattaaa 360 ttccagttcg gaaaggggta caactcatcctcatggcaaa ggtccttgaa gaccagcaaa 420 acattagact aataagggaa ctactccagaccctctacac atccttatgt acactggtcc 480 aaagaggtcg gcaagtctgt gctggtcgggaacattaaca tgtgggtgta tcggatggag 540 acgattctcc actggcagca gcagctgaacaacattcg 578 2 286 DNA Homo sapiens 2 tgttacacat actcatggtc tctcctactctcccttcatc acacctgtga cgtcttgcaa 60 tttatatttg attgtgtcat tatctaatgtattcctcctg gactctaagc tccaaatgaa 120 caaatgcttt gtctactttg ccataactgtgtccccagtg cttggcatgg gacctggtgc 180 acagtaaata catcataaat attttgtgaacgaatgaatg aatgaatgac catgattaat 240 aaaaggatat agcctgcctc agtctggtgctgataatggt ggtagt 286 3 441 DNA Homo sapiens 3 tggctgcctc agcctcccaaagtgctgggg ttacaggtgt gagccaccac acctggccta 60 aatttgcttt cttttaataagttttgggtt tgaattttaa gttgattttt aaatttttca 120 caagtatgca acataactcacatccctgaa cagtaaaggc caatgccttc agtggacaca 180 gatttaagag tgtagggagcagtgattcag gataaggtgt tggtagattt tccaaagtaa 240 ctctacctga atccctgagccagctggaca gggtcctttt atagttaaca cccttcccat 300 tgcattaaga acggtctgtttttatgtgtg actcacgtac tttccccaaa tgctatctga 360 gtacagctca gaactcaggcctattttatc cctgtatgtt ttgaatgggg tctggtatgt 420 tcagttaaat ctgaggattt t441 4 571 DNA Homo sapiens 4 aaaaaaggca gatgaaatta cttaatactc agtgttttggagagtattcc ttttagtttg 60 ttggatggct ggtttgaacg atagaaatat gcagcatgcaatatatgctt atatttcatt 120 ttaatttctg atatataatg aacatcttgg gagagatactgaatctttga tgttttttgt 180 cattgttctc aagtgcaata taacaatgta accaaatctagataatttca aagatgtcat 240 taatttagta agcctaatat aaacaaatat ttgtattatttttgttagca ggaaagagtg 300 attaagtgag gttatttacc cccaaatggt 
ccattctgcattgtatttca ggctgggaat 360 gaattattct ttaccagttt tgaaacactt tgaaatatcctaaggtaact tggaggctgt 420 gtagtatatc aaattaattt gctacctaat aacatagaaagtaaatatct ttgtggtcac 480 ccacattggg tgagacagaa aatgaatctg ttctaaaatttgtaatttgc taacttgatt 540 tgagttagtg aaaactgcca acttagagcc a 571 5 418DNA Homo sapiens Unsure (7)..(8) a, or c, or g, or t 5 atcaatnncatgcaaatgca ggctacaggg aagggggaac caggtacatt ctggccagtt 60 aaggaatcagagagtgttca agcttgacat cctctttcta tagatgagga attagctgag 120 taataagaaaaataattcta agtgtttctg ctcatatttt aggcagaaaa gtagcctggg 180 agaggtacatgagagtgaga gaggctagag aaacgtatgg ctgtatattt atgtttaaat 240 ttcaaacagaccaaaagcct acaccaacat gaaacattgg agctaaaggc agaaacaata 300 ccagcttaacagtcagccca gatgctaagc aaattactct cacctcatta aaagaagttc 360 cacaatcttctgcctaaagt ataattatca aaaataataa agtataccag tttgtttt 418 6 606 DNA Homosapiens 6 taacctggac aggctggggt ttctggtagt gaatgcggaa gaggacttgtgttttaagag 60 gaaggagaag tccaggtggg tcgttagtgg ccattatggt gtctcggtgctggtcaggtc 120 attggctgag actgagaaac acgtacgggg gccggcattg gcctgagtcaggccccacga 180 acctgcatct tcagggagca ttcgtccaca cccatcagac cagccccgggggagagggct 240 gggcagacct cgggccaggc tgtagttggg acccgcggct gtcaccgcagctgttagcag 300 cgaacagagt gaccgccagc ccctcccatg tccagcaggg cgtcccgccgtgtcccatgg 360 gctacacagc acagaggctg ggactgggga tgtccttgct gtgagctgggagggtcagcg 420 gaaaggcacg tgggagccgg caagtcctcg gctgccccgc aggttttcacttgcggcaaa 480 cggcaggatt tctctcttgg ctgcagacac agactccact tagttttaataaacaagagc 540 tttgaagagg atgagagcga gccaggaggc ctgcagcctg aatgactcactgggcacaag 600 aaagaa 606 7 559 DNA Homo sapiens 7 ctctgaggcg gccaccgcgcgccggccgcg ctctgcgcac aaaagccaaa cgcatccgac 60 tctctaaatg tgatttatttcttgctttga gattggagac cactttgcat tggccagggt 120 gtcttggggg cccggctggcctccgcggcc ggcgtcccct gcctccaccc tgtgcccgag 180 ggggtgtccg gtcctgcccatccgatactc tggtggaaat gtggctcttt gcagcatgta 240 cgtttctccc tgattttggttgatgcatat ttccccgttt aagtagccgt tagggcgcag 300 tatcggcagc ttgacacccaccaagcaaaa gtttcagcct ggaaaaaatg ggggggaagg 360 gtggatgaaa 
aggagggagagaaggtggag atggtttctt ttttctctat tttctttctt 420 tttttttttt tggtcaacagccgtttttct agttccaagt tttaaataca tggaaggaag 480 tccgggagaa ccatatgaaggagcaggagg agaggaggaa actttttttc cttcttttcc 540 aggagtagcc ggaaattaa 5598 525 DNA Homo sapiens 8 tgttcttctc caggccttag ctaaaccagg ccacctaaatacctgtcatt atctgtgatg 60 atttcgcatt aattggggtt tttgtccctt ctgttccttttcatagaggc tggcttctta 120 aacactccca atgtgtctgc tagttgctca ctgttatagatcgaagttat ggggctaggg 180 ggaaaaactg tataaaagcc ttaagtgatg aaacattttaataattaata tgaagggcgg 240 ctataaccag atacgctgtt ctaacaattc cagacctgcaggagctcatt ctgaccgaac 300 gtagccatgg ctttcaggac aacatttttt gtcagtgacttactataatc acaatgtggc 360 tacgaatagc tgtgtcaggc agaagttaaa catggaagtgaaagagaaca agtgttttcc 420 ctgatgtaat ggacaagatg tgttaagaaa tcttgttttggtatacactt agttttttta 480 gagacagggt tttgctctgt tgcccaggct ggagtgcagtggcct 525 9 642 DNA Homo sapiens 9 cacctgtctt tctgttttgg agcaggaaggggctggccta ctagacacac tctaggttat 60 gcttatttgt tttaattgtc tttctttcccccaaactaga atgaaataaa cacaaaggca 120 ggaacttctg tttattcagt ttaccacattacatccagtg ccttgaatgc acccagcatg 180 taacaggtat caataaatac gtgttcagtgactgagtgga agcatatatc catgaaactc 240 cacccggcct actcctgctt ccacgttaacggggccatgt gtggtcagct acacatctct 300 tgtcacaccc atcacctacc cggggcctgctagcgggagg aaaagcggta cctcctcctc 360 acagatggta gtggcaagct tctactgacaacaggctctt acccactact gacaacaggc 420 tcttacttcc aaaagcagtc aagaaggctcaggtgctggg ccacattccc aaacagctac 480 accatgtggg caaagagtat taaatgcatactttataacc gagaatgtgt aaacacctgc 540 ctatgtagtt atcaaacaga agatgcgcgcaggcagacac cacgatactt acatgcaagc 600 cgggaaacct actcaaggtc gttggtttggcttctcgcat tc 642 10 484 DNA Homo sapiens 10 acctggtgat gacagaatcagagtttgaac agagggagct taactgcaca gtgctgcctc 60 cactctcttc caccatcacaggcactcgtt ttgtgtacgg tcctgagact gtgggcccct 120 tgagggcagg gagcaaatctagctctcctc ctatgtctgc agaagcatct gggtacggtt 180 cgaggtgctt gcaatatatcaatgaacaaa acagagaccg ctgccctctt ggaacttaca 240 tttttatgtg ggaagatataagatgtgtct cctttattag tgtcaatatt gttgacacaa 300 agaaaaaata 
aatgacaccatgtgttagaa ggtaataagt gctagggaaa aaagttaaag 360 gagagaagag aaggggattaggagaggaag aggtgagttg cagttttaaa tagcatcatc 420 agatttggtc gcattggatgcatgacatgt aagcaaatca cacaatgtaa caggtgcatg 480 aaaa 484 11 459 DNA Homosapiens 11 aatgagatgc aggcggctgc gacctccaga cattccccct ttcccgctcttctccatccg 60 gtactggtag atgtgcagtg tgaagaacac gtgtgagttg cggtggtcgtcctcatcaca 120 gtcctgttgg tggctcctgc gggaggcaat ggcggcatcc aggaaaaaggcagccttctc 180 tgcggtgggg gcccgcagct cgctctggtt ctgcagctgc gtgccgtagatggggtcctc 240 acagaggtac acgcccgggg actggccgtc ctgcaggctg cccgtggccacctccgacag 300 caggtcccgc aggttctcct ccttccccca cacttccacg gcggaaacccggactgagaa 360 acgggcgccg gtcttttcct tgcgttcgtt tatgagctcg aagagccaagagatggcaca 420 gggaatggtg cccaggttct gcatggaatc atcctttcc 459 12 153 PRTHomo sapiens 12 Gly Lys Asp Asp Ser Met Gln Asn Leu Gly Thr Ile Pro CysAla Ile 1 5 10 15 Ser Trp Leu Phe Glu Leu Ile Asn Glu Arg Lys Glu LysThr Gly Ala 20 25 30 Arg Phe Ser Val Arg Val Ser Ala Val Glu Val Trp GlyLys Glu Glu 35 40 45 Asn Leu Arg Asp Leu Leu Ser Glu Val Ala Thr Gly SerLeu Gln Asp 50 55 60 Gly Gln Ser Pro Gly Val Tyr Leu Cys Glu Asp Pro IleTyr Gly Thr 65 70 75 80 Gln Leu Gln Asn Gln Ser Glu Leu Arg Ala Pro ThrAla Glu Lys Ala 85 90 95 Ala Phe Phe Leu Asp Ala Ala Ile Ala Ser Arg ArgSer His Gln Gln 100 105 110 Asp Cys Asp Glu Asp Asp His Arg Asn Ser HisVal Phe Phe Thr Leu 115 120 125 His Ile Tyr Gln Tyr Arg Met Glu Lys SerGly Lys Gly Gly Met Ser 130 135 140 Gly Gly Arg Ser Arg Leu His Leu Ile145 150 13 5489 DNA Homo sapiens 13 ggctgagttt tatgacgggc ccggtgctgaagggcaggga acaacttgat ggtgctactt 60 tgaactgctt ttcttttctc ctttttgcacaaagagtctc atgtctgata tttagacatg 120 atgagctttg tgcaaaaggg gagctggctacttctcgctc tgcttcatcc cactattatt 180 ttggcacaac aggaagctgt tgaaggaggatgttcccatc ttggtcagtc ctatgcggat 240 agagatgtct ggaagccaga accatgccaaatatgtgtct gtgactcagg atccgttctc 300 tgcgatgaca taatatgtga cgatcaagaattagactgcc ccaacccaga aattccattt 360 ggagaatgtt gtgcagtttg cccacagcctccaactgctc ctactcgccc tcctaatggt 420 
caaggacctc aaggccccaa gggagatccaggccctcctg gtattcctgg gagaaatggt 480 gaccctggta ttccaggaca accagggtcccctggttctc ctggcccccc tggaatctgt 540 gaatcatgcc ctactggtcc tcagaactattctccccagt atgattcata tgatgtcaag 600 tctggagtag cagtaggagg actcgcaggctatcctggac cagctggccc cccaggccct 660 cccggtcccc ctggtacatc tggtcatcctggttcccctg gatctccagg ataccaagga 720 ccccctggtg aacctgggca agctggtccttcaggccctc caggacctcc tggtgctata 780 ggtccatctg gtcctgctgg aaaagatggagaatcaggta gacccggacg acctggagag 840 cgaggattgc ctggacctcc aggtatcaaaggtccagctg ggatacctgg attccctggt 900 atgaaaggac acagaggctt cgatggacgaaatggagaaa agggtgaaac aggtgctcct 960 ggattaaagg gtgaaaatgg tcttccaggcgaaaatggag ctcctggacc catgggtcca 1020 agaggggctc ctggtgagcg aggacggccaggacttcctg gggctgcagg tgctcggggt 1080 aatgacggtg ctcgaggcag tgatggtcaaccaggccctc ctggtcctcc tggaactgcc 1140 ggattccctg gatcccctgg tgctaagggtgaagttggac ctgcagggtc tcctggttca 1200 aatggtgccc ctggacaaag aggagaacctggacctcagg gacacgctgg tgctcaaggt 1260 cctcctggcc ctcctgggat taatggtagtcctggtggta aaggcgaaat gggtcccgct 1320 ggcattcctg gagctcctgg actgatgggagcccggggtc ctccaggacc agccggtgct 1380 aatggtgctc ctggactgcg aggtggtgcaggtgagcctg gtaagaatgg tgccaaagga 1440 gagcccggac cacgtggtga acgcggtgaggctggtattc caggtgttcc aggagctaaa 1500 ggcgaagatg gcaaggatgg atcacctggagaacctggtg caaatgggct tccaggagct 1560 gcaggagaaa ggggtgcccc tgggttccgaggacctgctg gaccaaatgg catcccagga 1620 gaaaagggtc ctgctggaga gcgtggtgctccaggccctg cagggcccag aggagctgct 1680 ggagaacctg gcagagatgg cgtccctggaggtccaggaa tgaggggcat gcccggaagt 1740 ccaggaggac caggaagtga tgggaaaccagggcctcccg gaagtcaagg agaaagtggt 1800 cgaccaggtc ctcctgggcc atctggtccccgaggtcagc ctggtgtcat gggcttcccc 1860 ggtcctaaag gaaatgatgg tgctcctggtaagaatggag aacgaggtgg ccctggagga 1920 cctggccctc agggtcctcc tggaaagaatggtgaaactg gacctcaagg acccccaggg 1980 cctactgggc ctggtggtga caaaggagacacaggacccc ctggtccaca aggattacaa 2040 ggcttgcctg gtacaggtgg tcctccaggagaaaatggaa aacctgggga accaggtcca 2100 aagggtgatg ccggtgcacc tggagctccaggaggcaagg 
gtgatgctgg tgcccctggt 2160 gaacgtggac ctcctggatt ggcaggggccccaggactta gaggtggagc tggtccccct 2220 ggtcccgaag gaggaaaggg tgctgctggtcctcctgggc cacctggtgc tgctggtact 2280 cctggtctgc aaggaatgcc tggagaaagaggaggtcttg gaagtcctgg tccaaagggt 2340 gacaagggtg aaccaggcgg cccaggtgctgatggtgtcc cagggaaaga tggcccaagg 2400 ggtcctactg gtcctattgg tcctcctggcccagctggcc agcctggaga taagggtgaa 2460 ggtggtgccc ccggacttcc aggtatagctggacctcgtg gtagccctgg tgagagaggt 2520 gaaactggcc ctccaggacc tgctggtttccctggtgctc ctggacagaa tggtgaacct 2580 ggtggtaaag gagaaagagg ggctccgggtgagaaaggtg aaggaggccc tcctggagtt 2640 gcaggacccc ctggaggttc tggacctgctggtcctcctg gtccccaagg tgtcaaaggt 2700 gaacgtggca gtcctggtgg acctggtgctgctggcttcc ctggtgctcg tggtcttcct 2760 ggtcctcctg gtagtaatgg taacccaggacccccaggtc ccagcggttc tccaggcaag 2820 gatgggcccc caggtcctgc gggtaacactggtgctcctg gcagccctgg agtgtctgga 2880 ccaaaaggtg atgctggcca accaggagagaagggatcgc ctggtgccca gggcccacca 2940 ggagctccag gcccacttgg gattgctgggatcactggag cacggggtct tgcaggacca 3000 ccaggcatgc caggtcctag gggaagccctggccctcagg gtgtcaaggg tgaaagtggg 3060 aaaccaggag ctaacggtct cagtggagaacgtggtcccc ctggacccca gggtcttcct 3120 ggtctggctg gtacagctgg tgaacctggaagagatggaa accctggatc agatggtctt 3180 ccaggccgag atggatctcc tggtggcaagggtgatcgtg gtgaaaatgg ctctcctggt 3240 gcccctggcg ctcctggtca tccaggcccacctggtcctg tcggtccagc tggaaagagt 3300 ggtgacagag gagaaagtgg ccctgctggccctgctggtg ctcccggtcc tgctggttcc 3360 cgaggtgctc ctggtcctca aggcccacgtggtgacaaag gtgaaacagg tgaacgtgga 3420 gctgctggca tcaaaggaca tcgaggattccctggtaatc caggtgcccc aggttctcca 3480 ggccctgctg gtcagcaggg tgcaatcggcagtccaggac ctgcaggccc cagaggacct 3540 gttggaccca gtggacctcc tggcaaagatggaaccagtg gacatccagg tcccattgga 3600 ccaccagggc ctcgaggtaa cagaggtgaaagaggatctg agggctcccc aggccaccca 3660 gggcaaccag gccctcctgg acctcctggtgcccctggtc cttgctgtgg tggtgttgga 3720 gccgctgcca ttgctgggat tggaggtgaaaaagctggcg gttttgcccc gtattatgga 3780 gatgaaccaa tggatttcaa aatcaacaccgatgagatta tgacttcact caagtctgtt 3840 aatggacaaa 
tagaaagcct cattagtcctgatggttctc gtaaaaaccc cgctagaaac 3900 tgcagagacc tgaaattctg ccatcctgaactcaagagtg gagaatactg ggttgaccct 3960 aaccaaggat gcaaattgga tgctatcaaggtattctgta atatggaaac tggggaaaca 4020 tgcataagtg ccaatccttt gaatgttccacggaaacact ggtggacaga ttctagtgct 4080 gagaagaaac acgtttggtt tggagagtccatggatggtg gttttcagtt tagctacggc 4140 aatcctgaac ttcctgaaga tgtccttgatgtgcagctgg cattccttcg acttctctcc 4200 agccgagctt cccagaacat cacatatcactgcaaaaata gcattgcata catggatcag 4260 gccagtggaa atgtaaagaa ggccctgaagctgatggggt caaatgaagg tgaattcaag 4320 gctgaaggaa atagcaaatt cacctacacagttctggagg atggttgcac gaaacacact 4380 ggggaatgga gcaaaacagt ctttgaatatcgaacacgca aggctgtgag actacctatt 4440 gtagatattg caccctatga cattggtggtcctgatcaag aatttggtgt ggacgttggc 4500 cctgtttgct ttttataaac caaactctatctgaaatccc aacaaaaaaa atttaactcc 4560 atatgtgttc ctcttgttct aatcttgtcaaccagtgcaa gtgaccgaca aaattccagt 4620 tatttatttc caaaatgttt ggaaacagtataatttgaca aagaaaaatg atacttctct 4680 ttttttgctg ttccaccaaa tacaattcaaatgctttttg ttttattttt ttaccaattc 4740 caatttcaaa atgtctcaat ggtgctataataaataaact tcaacactct ttatgataac 4800 aacactgtgt tatattcttt gaatcctagcccatctgcag agcaatgact gtgctcacca 4860 gtaaaagata acctttcttt ctgaaatagtcaaatacgaa attagaaaag ccctccctat 4920 tttaactacc tcaactggtc agaaacacagattgtattct atgagtccca gaagatgaaa 4980 aaaattttat acgttgataa aacttataaatttcattgat taatctcctg gaagattggt 5040 ttaaaaagaa aagtgtaatg caagaatttaaagaaatatt tttaaagcca caattatttt 5100 aatattggat atcaactgct tgtaaaggtgctcctctttt ttcttgtcat tgctggtcaa 5160 gattactaat atttgggaag gctttaaagacgcatgttat ggtgctaatg tactttcact 5220 tttaaactct agatcagaat tgttgacttgcattcagaac ataaatgcac aaaatctgta 5280 catgtctccc atcagaaaga ttcattggcatgccacaggg attctcctcc ttcatcctgt 5340 aaaggtcaac aataaaaacc aaattatggggctgcttttg tcacactagc atagagaatg 5400 tgttgaaatt taactttgta agcttgtatgtggttgttga tctttttttt ccttacagac 5460 acccataata aaatatcata ttaaaattc5489 14 10558 DNA Homo sapiens 14 cagtttggag ctcagtcttc caccaaaggccgttcagttc tcctgggctc 
cagcctcctg 60 caaggactgc aagagttttc ctccgcagctctgagtctcc acttttttgg tggagaaagg 120 ctgcaaaaag aaaaagagac gcagtgagtgggaaaagtat gcatcctatt caaacctaat 180 tgaatcgagg agcccaggga cacacgccttcaggtttgct caggggttca tatttggtgc 240 ttagacaaat tcaaaatgag gaaacatcggcacttgccct tagtggccgt cttttgcctc 300 tttctctcag gctttcctac aactcatgcccagcagcagc aagcagatgt caaaaatggt 360 gcggctgctg atataatatt tctagtggattcctcttgga ccattggaga ggaacatttc 420 caacttgttc gagagtttct atatgatgttgtaaaatcct tagctgtggg agaaaatgat 480 ttccattttg ctctggtcca gttcaacggaaacccacata ccgagttcct gttaaatacg 540 tatcgtacta aacaagaagt cctttctcatatttccaaca tgtcttatat tgggggaacc 600 aatcagactg gaaaaggatt agaatacataatgcaaagcc acctcaccaa ggctgctgga 660 agccgggccg gtgacggagt ccctcaggttatcgtagtgt taactgatgg acactcgaag 720 gatggccttg ctctgccctc agcggaacttaagtctgctg atgttaacgt gtttgcaatt 780 ggagttgagg atgcagatga aggagcgttaaaagaaatag caagtgaacc gctcaatatg 840 catatgttca acctagagaa ttttacctcacttcatgaca tagtaggaaa cttagtgtcc 900 tgtgtgcatt catccgtgag tccagaaagggctggggaca cggaaaccct taaagacatc 960 acagcacaag actctgctga cattattttccttattgatg gatcaaacaa caccggaagt 1020 gtcaatttcg cagtcattct cgacttccttgtaaatctcc ttgagaaact cccaattgga 1080 actcagcaga tccgagtggg ggtggtccagtttagcgatg agcccagaac catgttttcc 1140 ttggacacct actccaccaa ggcccaggttctgggtgcag tgaaagccct cgggtttgct 1200 ggtggggagt tggccaatat cggcctcgcccttgatttcg tggtggagaa ccacttcacc 1260 cgggcagggg gcagccgcgt ggaggaaggggttccccagg tgctggtcct cataagtgcc 1320 gggccttcta gtgacgagat tcgctacggggtggtagcac tgaagcaggc tagcgtgttc 1380 tcattcggcc ttggagccca ggccgcctccagggcagagc ttcagcacat agctaccgat 1440 gacaacttgg tgtttactgt cccggaattccgtagctttg gggacctcca ggagaaatta 1500 ctgccgtaca ttgttggcgt ggcccaaaggcacattgtct tgaaaccgcc aaccattgtc 1560 acacaagtca ttgaagtcaa caagagagacatagtcttcc tggtggatgg ctcatctgca 1620 ctgggactgg ccaacttcaa tgccatccgagacttcattg ctaaagtcat ccagaggctg 1680 gaaatcggac aggatcttat ccaggtggcagtggcccagt atgcagacac tgtgaggcct 1740 gaattttatt tcaataccca 
tccaacaaaaagggaagtca taaccgctgt gcggaaaatg 1800 aagcccctgg acggctcggc cctgtacacgggctctgctc tagactttgt tcgtaacaac 1860 ctattcacga gttcagccgg ctaccgggctgccgagggga ttcctaagct tttggtgctg 1920 atcacaggtg gtaagtccct agatgaaatcagccagcctg cccaggagct gaagagaagc 1980 agcataatgg cctttgccat tgggaacaagggtgccgatc aggctgagct ggaagagatc 2040 gctttcgact cctccctggt gttcatcccagctgagttcc gagccgcccc attgcaaggc 2100 atgctgcctg gcttgctggc acctctcaggaccctctctg gaacccctga agttcactca 2160 aacaaaagag atatcatctt tcttttggatggatcagcca acgttggaaa aaccaatttc 2220 ccttatgtgc gcgactttgt aatgaacctagttaacagcc ttgatattgg aaatgacaat 2280 attcgtgttg gtttagtgca atttagtgacactcctgtaa cggagttctc tttaaacaca 2340 taccagacca agtcagatat ccttggtcatctgaggcagc tgcagctcca gggaggttcg 2400 ggcctgaaca caggctcagc cctaagctatgtctatgcca accacttcac ggaagctggc 2460 ggcagcagga tccgtgaaca cgtgccgcagctcctgcttc tgctcacagc tgggcagtct 2520 gaggactcct atttgcaagc tgccaacgccttgacacgcg cgggcatcct gactttttgt 2580 gtgggagcta gccaggcgaa taaggcagagcttgagcaga ttgcttttaa cccaagcctg 2640 gtgtatctca tggatgattt cagctccctgccagctttgc ctcagcagct gattcagccc 2700 ctaaccacat atgttagtgg aggtgtggaggaagtaccac tcgctcagcc agagagcaag 2760 cgagacattc tgttcctctt tgacggctcagccaatcttg tgggccagtt ccctgttgtc 2820 cgtgactttc tctacaagat tatcgatgagctcaatgtga agccagaggg gacccgaatt 2880 gcggtggctc agtacagcga tgatgtcaaggtggagtccc gttttgatga gcaccagagt 2940 aagcctgaga tcctgaatct tgtgaagagaatgaagatca agacgggcaa agccctcaac 3000 ctgggctacg cgctggacta tgcacagaggtacatttttg tgaagtctgc tggcagccgg 3060 atcgaggatg gagtgcttca gttcctggtgctgctggtcg caggaaggtc atctgaccgt 3120 gtggatgggc cagcaagtaa cctgaagcagagtggggttg tgcctttcat cttccaagcc 3180 aagaacgcag accctgctga gttagagcagatcgtgctgt ctccagcgtt tatcctggct 3240 gcagagtcgc ttcccaagat tggagatcttcatccacaga tagtgaatct cttaaaatca 3300 gtgcacaacg gagcaccagc accagtttcaggtgaaaagg acgtggtgtt tctgcttgat 3360 ggctctgagg gcgtcaggag cggcttccctctgttgaaag agtttgtcca gagagtggtg 3420 gaaagcctgg atgtgggcca ggaccgggtccgcgtggccg tggtgcagta 
cagcgaccgg 3480 accaggcccg agttctacct gaattcatacatgaacaagc aggacgtcgt caacgctgtc 3540 cgccagctga ccctgctggg agggccgacccccaacaccg gggccgccct ggagtttgtc 3600 ctgaggaaca tcctggtcag ctctgcgggaagcaggataa cagaaggtgt gccccagctg 3660 ctgatcgtcc tcacggccga caggtctggggatgatgtgc ggaacccctc cgtggtcgtg 3720 aagaggggtg gggctgtgcc cattggcattggcatcggga acgctgacat cacagagatg 3780 cagaccatct ccttcatccc ggactttgccgtggccattc ccacctttcg ccagctgggg 3840 accgtccaac aggtcatctc tgagagggtgacccagctca cccgcgagga gctgagcagg 3900 ctgcagccgg tgttgcagcc tctaccgagcccaggtgttg gtggcaagag ggacgtggtc 3960 tttctcatcg atgggtccca aagtgccgggcctgagttcc agtacgttcg caccctcata 4020 gagaggctgg ttgactacct ggacgtgggctttgacacca cccgggtggc tgtcatccag 4080 ttcagcgatg accccaaggc ggagttcctgctgaacgccc attccagcaa ggatgaagtg 4140 cagaacgcgg tgcagcggct gaggcccaagggagggcggc agatcaacgt gggcaatgcc 4200 ctggagtacg tgtccaggaa catcttcaagaggcccctgg ggagccgcat tgaagagggc 4260 gtcccacagt tcctggtcct catctcgtctggaaagtctg acgatgaggt ggtcgtcccg 4320 gcggtggagc tcaagcagtt tggcgtggcccctttcacga tcgccaggaa cgcagaccag 4380 gaggagctgg tgaagatctc gctgagccccgaatatgtgt tctcggtgag caccttccgg 4440 gagctgccca gcctggagca gaaactgctgacgcccatca cgaccctgac ctcagagcag 4500 atccagaagc tcttagccag cactcgctatccacctccag cagttgagag tgatgctgca 4560 gacattgtct ttctgatcga cagctctgagggagttaggc cagatggctt tgcacatatt 4620 cgagattttg ttagcaggat tgttcgaagactcaacatcg gccccagtaa agtgagagtt 4680 ggggtcgtgc agttcagcaa tgatgtcttcccagaattct atctgaaaac ctacagatcc 4740 caggccccgg tgctggacgc catacggcgcctgaggctca gaggggggtc cccactgaac 4800 actggcaagg ctctcgaatt tgtggcaagaaacctctttg ttaagtctgc ggggagtcgc 4860 atagaagacg gggtgcccca acacctggtcctggtcctgg gtggaaaatc ccaggacgat 4920 gtgtccaggt tcgcccaggt gatccgttcctcgggcattg tgagtttagg ggtaggagac 4980 cggaacatcg acagaacaga gctgcagaccatcaccaatg accccagact ggtcttcaca 5040 gtgcgagagt tcagagagct tcccaacatagaagaaagaa tcatgaactc gtttggaccc 5100 tccgcagcca ctcctgcacc tccaggggtggacacccctc ctccttcacg gccagagaag 5160 aagaaagcag acattgtgtt 
cctgttggatggttccatca acttcaggag ggacagtttc 5220 caggaagtgc ttcgttttgt gtctgaaatagtggacacag tttatgaaga tggcgactcc 5280 atccaagtgg ggcttgtcca gtacaactctgaccccactg acgaattctt cctgaaggac 5340 ttctctacca agaggcagat tattgacgccatcaacaaag tggtctacaa agggggaaga 5400 cacgccaaca ctaaggtggg ccttgagcacctgcgggtaa accactttgt gcctgaggca 5460 ggcagccgcc tggaccagcg ggtccctcagattgcctttg tgatcacggg aggaaagtcg 5520 gtggaagatg cacaggatgt gagcctggccctcacccaga ggggggtcaa agtgtttgct 5580 gttggagtga ggaatatcga ctcggaggaggttggaaaga tagcgtccaa cagcgccaca 5640 gcgttccgcg tgggcaacgt ccaggagctgtccgaactga gcgagcaagt tttggaaact 5700 ttgcatgatg cgatgcatga aaccctttgccctggtgtaa ctgatgctgc caaagcttgt 5760 aatctggatg tgattctggg gtttgatggttctagagacc agaatgtttt tgtggcccag 5820 aagggcttcg agtccaaggt ggacgccatcttgaacagaa tcagccagat gcacagggtc 5880 agctgcagcg gtggccgctc gcccaccgtgcgtgtgtcag tggtggccaa cacgccctcg 5940 ggcccggtgg aggcctttga ctttgacgagtaccagccag agatgctcga gaagttccgg 6000 aacatgcgca gccagcaccc ctacgtcctcacggaggaca ccctgaaggt ctacctgaac 6060 aagttcagac agtcctcgcc ggacagcgtgaaggtggtca ttcattttac tgatggagca 6120 gacggagatc tggctgattt acacagagcatctgagaacc tccgccaaga aggagtccgt 6180 gccttgatcc tggtgggcct tgaacgagtggtcaacttgg agcggctaat gcatctggag 6240 tttgggcgag ggtttatgta tgacaggcccctgaggctta acttgctgga cttggattat 6300 gaactagcgg agcagcttga caacattgccgagaaagctt gctgtggggt tccctgcaag 6360 tgctctgggc agaggggaga ccgcgggcccatcggcagca tcgggccaaa gggtattcct 6420 ggagaagacg gctaccgagg ctatcctggtgatgagggtg gacccggtga gcgtggtccg 6480 cctggtgtga acggcactca aggtttccagggctgcccgg gccagagagg agtaaagggc 6540 tctcggggat tcccaggaga gaagggcgaagtaggagaaa ttggactgga tggtctggat 6600 ggtgaagatg gagacaaagg attgcctggttcttctggag agaaagggaa tcctggaaga 6660 aggggtgata aaggacctcg aggagagaaaggagaaagag gagatgttgg gattcgaggg 6720 gacccgggta acccaggaca agacagccaggagagaggac ccaaaggaga aaccggtgac 6780 ctcggcccca tgggtgtccc agggagagatggagtacctg gaggacctgg agaaactggg 6840 aagaatggtg gctttggccg aaggggaccccccggagcta agggcaacaa 
gggcggtcct 6900 ggccagccgg gctttgaggg agagcaggggaccagaggtg cacagggccc agctggtcct 6960 gctggtcctc cagggctgat aggagaacaaggcatttctg gacctagggg aagcggaggt 7020 gcccgtggcg ctcctggaga acgaggcagaaccggtccac tgggaagaaa gggtgagccc 7080 ggagagccag gaccaaaagg aggaatcgggaacccgggcc ctcgtgggga gacgggagat 7140 gacgggagag acggagttgg cagtgaaggacgcagaggca aaaaaggaga aagaggattt 7200 cctggatacc caggaccaaa gggtaacccaggtgaacctg ggctaaatgg aacaacagga 7260 cccaaaggca tcagaggccg aaggggaaattcgggacctc cagggatagt tggacagaag 7320 gggagacctg gctacccagg accagctggtccaaggggca acaggggcga ctccatcgat 7380 caatgtgccc tcatccaaag catcaaagataaatgccctt gctgttacgg gcccctggag 7440 tgccccgtct tcccaacaga actagcctttgctttagaca cctctgaggg agtcaaccaa 7500 gacactttcg gccggatgcg agatgtggtcttgagtattg tgaatgtcct gaccattgct 7560 gagagcaact gcccgacggg ggcccgggtggctgtggtca cctacaacaa cgaggtgacc 7620 acggagatcc ggtttgctga ctccaagaggaagtcggtcc tcctggacaa gattaagaac 7680 cttcaggtgg ctctgacatc caaacagcagagtctggaga ctgccatgtc gtttgtggcc 7740 aggaacacat ttaagcgtgt gaggaacggattcctaatga ggaaagtggc tgttttcttc 7800 agcaacacac ccacaagagc atccccacagctcagagagg ctgtgctcaa actctcagat 7860 gcggggatca cccccttgtt ccttacaaggcaggaagacc ggcagctcat caacgctttg 7920 cagatcaata acacagcagt ggggcatgcgcttgtcctgc ctgcagggag agacctcaca 7980 gacttcctgg agaatgtcct cacgtgtcatgtttgcttgg acatctgcaa catcgaccca 8040 tcctgtggat ttggcagttg gaggccttccttcagggaca ggagagcggc agggagtgat 8100 gtggacatcg acatggcttt catcttagacagcgctgaga ccaccaccct gttccagttc 8160 aatgagatga agaagtacat agcgtacctggtcagacaac tggacatgag cccagatccc 8220 aaggcctccc agcacttcgc cagagtggcagttgtgcagc acgcgccctc tgagtccgtg 8280 gacaatgcca gcatgccacc tgtgaaggtggaattctccc tgactgacta tggctccaag 8340 gagaagctgg tggacttcct cagcaggggaatgacacagt tgcagggaac cagggcctta 8400 ggcagtgcca ttgaatacac catagagaatgtctttgaaa gtgccccaaa cccacgggac 8460 ctgaaaattg tggtcctgat gctgacgggcgaggtgccgg agcagcagct ggaggaggcc 8520 cagagagtca tcctgcaggc caaatgcaagggctacttct tcgtggtcct gggcattggc 8580 aggaaggtga acatcaagga 
ggtatacaccttcgccagtg agccaaacga cgtcttcttc 8640 aaattagtgg acaagtccac cgagctcaacgaggagcctt tgatgcgctt cgggaggctg 8700 ttgccgtcct tcgtcagcag tgaaaatgctttttacttgt ccccagatat caggaaacag 8760 tgtgattggt tccaagggga ccaacccacaaagaaccttg tgaagtttgg tcacaaacaa 8820 gtaaatgttc cgaataacgt tacttcaagtcctacatcca acccagtgac gacaacgaag 8880 ccggtgacta cgacgaagcc ggtgaccaccacaacaaagc ctgtaaccac cacaacaaag 8940 cctgtgacta ttataaatca gccatctgtgaagccagccg ctgcaaagcc ggcccctgcg 9000 aaacctgtgg ctgccaagcc tgtggccacaaagacggcca ctgttagacc cccagtggcg 9060 gtgaagccag caacagcagc gaagcctgtagcagcaaagc cagcagctgt aagacccccc 9120 gctgctgctg caaaaccagt ggcgaccaagcctgaggtcc ctaggccaca ggcagccaaa 9180 ccagctgcca ccaagccagc caccactaagcccgtggtta agatgctccg tgaagtccag 9240 gtgtttgaga taacagagaa cagcgccaaactccactggg agaggcctga gccccccggt 9300 ccttattttt atgacctcac cgtcacctcagcccatgatc agtccctggt tctgaagcag 9360 aacctcacgg tcacggaccg cgtcattggaggcctgctcg ctgggcagac ataccatgtg 9420 gctgtggtct gctacctgag gtctcaggtcagagccacct accacggaag tttcagtaca 9480 aagaaatctc agcccccacc tccacagccagcaaggtcag cttctagttc aaccatcaat 9540 ctaatggtga gcacagaacc attggctctcactgaaacag atatatgcaa gttgccgaaa 9600 gacgaaggaa cttgcaggga tttcatattaaaatggtact atgatccaaa caccaaaagc 9660 tgtgcaagat tctggtatgg aggttgtggtggaaacgaaa acaaatttgg atcacagaaa 9720 gaatgtgaaa aggtttgcgc tcctgtgctcgccaaacccg gagtcatcag tgtgatggga 9780 acctaagcgt gggtggccaa catcatatacctcttgaaga agaaggagtc agccatcgcc 9840 aacttgtctc tgtagaagct ccgggtgtagattcccttgc actgtatcat ttcatgcttt 9900 gatttacact cgaactcggg agggaacatcctgctgcatg acctatcagt atggtgctaa 9960 tgtgtctgtg gaccctcgct ctctgtctccagcagttctc tcgaatactt tgaatgttgt 10020 gtaacagtta gccactgctg gtgtttatgtgaacattcct atcaatccaa attccctctg 10080 gagtttcatg ttatgcctgt tgcaggcaaatgtaaagtct agaaaataat gcaaatgtca 10140 cggctactct atatactttt gcttggttcattttttttcc cttttagtta agcatgactt 10200 tagatgggaa gcctgtgtat cgtggagaaacaagagacca actttttcat tccctgcccc 10260 caatttccca gactagattt caagctaattttctttttct gaagcctcta 
acaaatgatc 10320 tagttcagaa ggaagcaaaa tcccttaatctatgtgcacc gttgggacca atgccttaat 10380 taaagaattt aaaaaagttg taatagagaatatttttggc attcctctca atgttgtgtg 10440 tttttttttt ttgtgtgctg gagggaggggatttaatttt aattttaaaa tgtttaggaa 10500 atttatacaa agaaactttt taataaagtatattgaaagt ttaaaaaaaa aaaaaaaa 10558 15 6319 DNA Homo sapiens 15acacagtact ctcagcttgt tggtggaagc ccctcatctg ccttcattct gaaggcaggg 60cccggcagag gaaggatcag agggtcgcgg ccggagggtc ccggccggtg gggccaactc 120agagggagag gaaagggcta gagacacgaa gaacgcaaac catcaaattt agaagaaaaa 180gccctttgac tttttccccc tctccctccc caatggctgt gtagcaaaca tccctggcga 240taccttggaa aggacgaagt tggtctgcag tcgcaatttc gtgggttgag ttcacagttg 300tgagtgcggg gctcggagat ggagccgtgg tcctctaggt ggaaaacgaa acggtggctc 360tgggatttca ccgtaacaac cctcgcattg accttcctct tccaagctag agaggtcaga 420ggagctgctc cagttgatgt actaaaagca ctagattttc acaattctcc agagggaata 480tcaaaaacaa cgggattttg cacaaacaga aagaattcta aaggctcaga tactgcttac 540agagtttcaa agcaagcaca actcagtgcc ccaacaaaac agttatttcc aggtggaact 600ttcccagaag acttttcaat actatttaca gtaaaaccaa aaaaaggaat tcagtctttc 660cttttatcta tatataatga gcatggtatt cagcaaattg gtgttgaggt tgggagatca 720cctgtttttc tgtttgaaga ccacactgga aaacctgccc cagaagacta tcccctcttc 780agaactgtta acatcgctga cgggaagtgg catcgggtag caatcagcgt ggagaagaaa 840actgtgacaa tgattgttga ttgtaagaag aaaaccacga aaccacttga tagaagtgag 900agagcaattg ttgataccaa tggaatcacg gtttttggaa caaggatttt ggatgaagaa 960gtttttgagg gggacattca gcagtttttg atcacaggtg atcccaaggc agcatatgac 1020tactgtgagc attatagtcc agactgtgac tcttcagcac ccaaggctgc tcaagctcag 1080gaacctcaga tagatgagta tgcaccagag gatataatcg aatatgacta tgagtatggg 1140gaagcagagt ataaagaggc tgaaagtgta acagagggac ccactgtaac tgaggagaca 1200atagcacaga cggaggcaaa catcgttgat gattttcaag aatacaacta tggaacaatg 1260gaaagttacc agacagaagc tcctaggcat gtttctggga caaatgagcc aaatccagtt 1320gaagaaatat ttactgaaga atatctaacg ggagaggatt atgattccca gaggaaaaat 1380tctgaggata cactatatga aaacaaagaa atagacggca gggattctga tcttctggta 1440gatggagatt 
taggcgaata tgatttttat gaatataaag aatatgaaga taaaccaaca 1500agccccccta atgaagaatt tggtccaggt gtaccagcag aaactgatat tacagaaaca 1560agcataaatg gccatggtgc atatggagag aaaggacaga aaggagaacc agcagtggtt 1620gagcctggta tgcttgtcga aggaccacca ggaccagcag gacctgcagg tattatgggt 1680cctccaggtc tacaaggccc cactggaccc cctggtgacc ctggcgatag gggcccccca 1740ggacgtcctg gcttaccagg ggctgatggt ctacctggtc ctcctggtac tatgttgatg 1800ttaccgttcc gttatggtgg tgatggttcc aaaggaccaa ccatctctgc tcaggaagct 1860caggctcaag ctattcttca gcaggctcgg attgctctga gaggcccacc tggcccaatg 1920ggtctaactg gaagaccagg tcctgtgggg gggcctggtt catctggggc caaaggtgag 1980agtggtgatc caggtcctca gggccctcga ggcgtccagg gtccccctgg tccaacggga 2040aaacctggaa aaaggggtcg tccaggtgca gatggaggaa gaggaatgcc aggagaacct 2100ggggcaaagg gagatcgagg gtttgatgga cttccgggtc tgccaggtga caaaggtcac 2160aggggtgaac gaggtcctca aggtcctcca ggtcctcctg gtgatgatgg aatgagggga 2220gaagatggag aaattggacc aagaggtctt ccaggtgaag ctggcccacg aggtttgctg 2280ggtccaaggg gaactccagg agctccaggg cagcctggta tggcaggtgt agatggcccc 2340ccaggaccaa aagggaacat gggtccccaa ggggagcctg ggcctccagg tcaacaaggg 2400aatccaggac ctcagggtct tcctggtcca caaggtccaa ttggtcctcc tggtgaaaaa 2460ggaccacaag gaaaaccagg acttgctgga cttcctggtg ctgatgggcc tcctggtcat 2520cctgggaaag aaggccagtc tggagaaaag ggggctctgg gtccccctgg tccacaaggt 2580cctattggat acccgggccc ccggggagta aagggagcag atggtgtcag aggtctcaag 2640ggatctaaag gtgaaaaggg tgaagatggt tttccaggat tcaaaggtga catgggtcta 2700aaaggtgaca gaggagaagt tggtcaaatt ggcccaagag gggaagatgg ccctgaagga 2760cccaaaggtc gagcaggccc aactggagac ccaggtcctt caggtcaagc aggagaaaag 2820ggaaaacttg gagttccagg attaccagga tatccaggaa gacaaggtcc aaagggttcc 2880actggattcc ctgggtttcc aggtgccaat ggagagaaag gtgcacgggg agtagctggc 2940aaaccaggcc ctcggggtca gcgtggtcca acgggtcctc gaggttcaag aggtgcaaga 3000ggtcccactg ggaaacctgg gccaaagggc acttcaggtg gcgatggccc tcctggccct 3060ccaggtgaaa gaggtcctca aggacctcag ggtccagttg gattccctgg accaaaaggc 3120cctcctggac caccaggaag gatgggctgc ccaggacacc 
ctgggcaacg tggggagact 3180ggatttcaag gcaagaccgg ccctcctggg ccagggggag tggttggacc acagggacca 3240accggtgaga ctggtccaat aggggaacgt gggcatcctg gccctcctgg ccctcctggt 3300gagcaaggtc ttcctggtgc tgcaggaaaa gaaggtgcaa agggtgatcc aggtcctcaa 3360ggtatctcag ggaaagatgg accagcagga ttacgtggtt tcccagggga aagaggtctt 3420cctggagctc agggtgcacc tggactgaaa ggaggggaag gtccccaggg cccaccaggt 3480ccagttggct caccaggaga acgtgggtca gcaggtacag ctggcccaat tggtttacca 3540gggcgcccgg gacctcaggg tcctcctggt ccagctggag agaaaggtgc tcctggagaa 3600aaaggtcccc aagggcctgc agggagagat ggagttcaag gtcctgttgg tctcccaggg 3660ccagctggtc ctgccggctc ccctggggaa gacggagaca agggtgaaat tggtgagccg 3720ggacaaaaag gcagcaaggg tgacaaggga gaaaatggcc ctcccggtcc cccaggtctt 3780caaggaccag ttggtgcccc tggaattgct ggaggtgatg gtgaaccagg tcctagagga 3840cagcagggga tgtttgggca aaaaggtgat gagggtgcca gaggcttccc tggacctcct 3900ggtccaatag gtcttcaggg tctgccaggc ccacctggtg aaaaaggtga aaatggggat 3960gttggtccat gggggccacc tggtcctcca ggcccaagag gccctcaagg tcccaatgga 4020gctgatggac cacaaggacc cccaggttct gttggttcag ttggtggtgt tggagaaaag 4080ggtgaacctg gagaagcagg aaacccaggg cctcctgggg aagcaggtgt aggcggtccc 4140aaaggagaaa gaggagagaa aggggaagct ggtccacctg gagctgctgg acctccaggt 4200gccaaggggc cgccaggtga tgatggccct aagggtaacc cgggtcctgt tggttttcct 4260ggagatcctg gtcctcctgg ggaacttggc cctgcaggtc aagatggtgt tggtggtgac 4320aagggtgaag atggagatcc tggtcaaccg ggtcctcctg gcccatctgg tgaggctggc 4380ccaccaggtc ctcctggaaa acgaggtcct cctggagctg caggtgcaga gggaagacaa 4440ggtgaaaaag gtgctaaggg ggaagcaggt gcagaaggtc ctcctggaaa aaccggccca 4500gtcggtcctc agggacctgc aggaaagcct ggtccagaag gtcttcgggg catccctggt 4560cctgtgggag aacaaggtct ccctggagct gcaggccaag atggaccacc tggtcctatg 4620ggacctcctg gcttacctgg tctcaaaggt gaccctggct ccaagggtga aaagggacat 4680cctggtttaa ttggcctgat tggtcctcca ggagaacaag gggaaaaagg tgaccgaggg 4740ctccctggaa ctcaaggatc tccaggagca aaaggggatg ggggaattcc tggtcctgct 4800ggtcccttag gtccacctgg tcctccaggc ttaccaggtc ctcaaggccc aaagggtaac 4860aaaggctcta 
ctggacccgc tggccagaaa ggtgacagtg gtcttccagg gcctcctggg 4920cctccaggtc cacctggtga agtcattcag cctttaccaa tcttgtcctc caaaaaaacg 4980agaagacata ctgaaggcat gcaagcagat gcagatgata atattcttga ttactcggat 5040ggaatggaag aaatatttgg ttccctcaat tccctgaaac aagacatcga gcatatgaaa 5100tttccaatgg gtactcagac caatccagcc cgaacttgta aagacctgca actcagccat 5160cctgacttcc cagatggtga atattggatt gatcctaacc aaggttgctc aggagattcc 5220ttcaaagttt actgtaattt cacatctggt ggtgagactt gcatttatcc agacaaaaaa 5280tctgagggag taagaatttc atcatggcca aaggagaaac caggaagttg gtttagtgaa 5340tttaagaggg gaaaactgct ttcatactta gatgttgaag gaaattccat caatatggtg 5400caaatgacat tcctgaaact tctgactgcc tctgctcggc aaaatttcac ctaccactgt 5460catcagtcag cagcctggta tgatgtgtca tcaggaagtt atgacaaagc acttcgcttc 5520ctgggatcaa atgatgagga gatgtcctat gacaataatc cttttatcaa aacactgtat 5580gatggttgta cgtccagaaa aggctatgaa aagactgtca ttgaaatcaa tacaccaaaa 5640attgatcaag tacctattgt tgatgtcatg atcaatgact ttggtgatca gaatcagaag 5700ttcggatttg aagttggtcc tgtttgtttt cttggctaag attaagacaa agaacatatc 5760aaatcaacag aaaatatacc ttggtgccac caacccattt tgtgccacat gcaagttttg 5820aataaggatg gtatagaaaa caacgctgca tatacaggta ccatttagga aataccgatg 5880cctttgtggg ggcagaatca catggcaaaa gctttgaaaa tcataaagat ataagttggt 5940gtggctaaga tggaaacagg gctgattctt gattcccaat tctcaactct ccttttccta 6000tttgaatttc tttggtgctg tagaaaacaa aaaaagaaaa atatatattc ataaaaaata 6060tggtgctcat tctcatccat ccaggatgta ctaaaacagt gtgtttaata aattgtaatt 6120attttgtgta cagttctata ctgttatctg tgtccatttc caaaacttgc acgtgtccct 6180gaattccatc tgactctaat tttatgagaa ttgcagaact ctgatggcaa taaatatatg 6240tattatgaaa aaataaagtt gtaatttctg atgactctaa gtccctttct ttggttaata 6300ataaaatgcc tttgtatat 6319 16 8368 DNA Homo sapiens 16 gcgatccgggcgccaccccg cggtcatcgg tcaccggtcg ctctcaggaa cagcagcgca 60 acctctgctccctgcctcgc ctcccgcgcg cctaggtgcc tgcgacttta attaaagggc 120 cgtcccctcgccgaggctgc agcaccgccc ccccggcttc tcgcgcctca aaatgagtag 180 ctcccactctcgggcgggcc agagcgcagc aggcgcggct ccgggcggcg gcgtcgacac 240 
gcgggacgccgagatgccgg ccaccgagaa ggacctggcg gaggacgcgc cgtggaagaa 300 gatccagcagaacactttca cgcgctggtg caacgagcac ctgaagtgcg tgagcaagcg 360 catcgccaacctgcagacgg acctgagcga cgggctgcgg cttatcgcgc tgttggaggt 420 gctcagccagaagaagatgc accgcaagca caaccagcgg cccactttcc gccaaatgca 480 gcttgagaacgtgtcggtgg cgctcgagtt cctggaccgc gagagcatca aactggtgtc 540 catcgacagcaaggccatcg tggacgggaa cctgaagctg atcctgggcc tcatctggac 600 cctgatcctgcactactcca tctccatgcc catgtgggac gaggaggagg atgaggaggc 660 caagaagcagacccccaagc agaggctcct gggctggatc cagaacaagc tgccgcagct 720 gcccatcaccaacttcagcc gggactggca gagcggccgg gccctgggcg ccctggtgga 780 cagctgtgccccgggcctgt gtcctgactg ggactcttgg gacgccagca agcccgttac 840 caatgcgcgagaggccatgc agcaggcgga tgactggctg ggcatccccc aggtgatcac 900 ccccgaggagattgtggacc ccaacgtgga cgagcactct gtcatgacct acctgtccca 960 gttccccaaggccaagctga agccaggggc tcccttgcgc cccaaactga acccgaagaa 1020 agcccgtgcctacgggccag gcatcgagcc cacaggcaac atggtgaaga agcgggcaga 1080 gttcactgtggagaccagaa gtgctggcca gggagaggtg ctggtgtacg tggaggaccc 1140 ggccggacaccaggaggagg caaaagtgac cgccaataac gacaagaacc gcaccttctc 1200 cgtctggtacgtccccgagg tgacggggac tcataaggtt actgtgctct ttgctggcca 1260 gcacatcgccaagagcccct tcgaggtgta cgtggataag tcacagggtg acgccagcaa 1320 agtgacagcccaaggtcccg gcctggagcc cagtggcaac atcgccaaca agaccaccta 1380 ctttgagatctttacggcag gagctggcac gggcgaggtc gaggttgtga tccaggaccc 1440 catgggacagaagggcacgg tagagcctca gctggaggcc cggggcgaca gcacataccg 1500 ctgcagctaccagcccacca tggagggcgt ccacaccgtg cacgtcacgt ttgccggcgt 1560 gcccatccctcgcagcccct acactgtcac tgttggccaa gcctgtaacc cgagtgcctg 1620 ccgggcggttggccggggcc tccagcccaa gggtgtgcgg gtgaaggaga cagctgactt 1680 caaggtgtacacaaagggcg ctggcagtgg ggagctgaag gtcaccgtga agggccccaa 1740 gggagaggagcgcgtgaagc agaaggacct gggggatggc gtgtatggct tcgagtatta 1800 ccccatggtccctggaacct atatcgtcac catcacgtgg ggtggtcaga acatcgggcg 1860 cagtcccttcgaagtgaagg tgggcaccga gtgtggcaat cagaaggtac gggcctgggg 1920 ccctgggctggagggcggcg tcgttggcaa gtcagcagac 
tttgtggtgg aggctatcgg 1980 ggacgacgtgggcacgctgg gcttctcggt ggaagggcca tcgcaggcta agatcgaatg 2040 tgacgacaagggcgacggct cctgtgatgt gcgctactgg ccgcaggagg ctggcgagta 2100 tgccgttcacgtgctgtgca acagcgaaga catccgcctc agccccttca tggctgacat 2160 ccgtgacgcgccccaggact tccacccaga cagggtgaag gcacgtgggc ctggattgga 2220 gaagacaggtgtggccgtca acaagccagc agagttcaca gtggatgcca agcacggtgg 2280 caaggccccacttcgggtcc aagtccagga caatgaaggc tgccctgtgg aggcgttggt 2340 caaggacaacggcaatggca cttacagctg ctcctacgtg cccaggaagc cggtgaagca 2400 cacagccatggtgtcctggg gaggcgtcag catccccaac agccccttca gggtgaatgt 2460 gggagctggcagccacccca acaaggtcaa agtatacggc cccggagtag ccaagacagg 2520 gctcaaggcccacgagccca cctacttcac tgtggactgc gccgaggctg gccaggggga 2580 cgtcagcatcggcatcaagt gtgcccctgg agtggtaggc cccgccgaag ctgacatcga 2640 cttcgacatcatccgcaatg acaatgacac cttcacggtc aagtacacgc cccggggggc 2700 tggcagctacaccattatgg tcctctttgc tgaccaggcc acgcccacca gccccatccg 2760 agtcaaggtggagccctctc atgacgccag taaggtgaag gccgagggcc ctggcctcag 2820 tcgcactggtgtcgagcttg gcaagcccac ccacttcaca gtaaatgcca aagctgctgg 2880 caaaggcaagctggacgtcc agttctcagg actcaccaag ggggatgcag tgcgagatgt 2940 ggacatcatcgaccaccatg acaacaccta cacagtcaag tacacgcctg tccagcaggg 3000 tccagtaggcgtcaatgtca cttatggagg ggatcccatc cctaagagcc ctttctcagt 3060 ggcagtatctccaagcctgg acctcagcaa gatcaaggtg tctggcctgg gagagaaggt 3120 ggacgttggcaaagaccagg agttcacagt caaatcaaag ggtgctggtg gtcaaggcaa 3180 agtggcatccaagattgtgg gcccctcggg tgcagcggtg ccctgcaagg tggagccagg 3240 cctgggggctgacaacagtg tggtgcgctt cctgccccgt gaggaagggc cctatgaggt 3300 ggaggtgacctatgacggcg tgcccgtgcc tggcagcccc tttcctctgg aagctgtggc 3360 ccccaccaagcctagcaagg tgaaggcgtt tgggccgggg ctgcagggag gcagtgcggg 3420 ctcccccgcccgcttcacca tcgacaccaa gggcgccggc acaggtggcc tgggcctgac 3480 ggtggagggcccctgtgagg cgcagctcga gtgcttggac aatggggatg gcacatgttc 3540 cgtgtcctacgtgcccaccg agcccgggga ctacaacatc aacatcctct tcgctgacac 3600 ccacatccctggctccccat tcaaggccca cgtggttccc tgctttgacg catccaaagt 3660 
caagtgctcaggccccgggc tggagcgggc caccgctggg gaggtgggcc aattccaagt 3720 ggactgctcgagcgcgggca gcgcggagct gaccattgag atctgctcgg aggcggggct 3780 tccggccgaggtgtacatcc aggaccacgg tgatggcacg cacaccatta cctacattcc 3840 cctctgccccggggcctaca ccgtcaccat caagtacggc ggccagcccg tgcccaactt 3900 ccccagcaagctgcaggtgg aacctgcggt ggacacttcc ggtgtccagt gctatgggcc 3960 tggtattgagggccagggtg tcttccgtga ggccaccact gagttcagtg tggacgcccg 4020 ggctctgacacagaccggag ggccgcacgt caaggcccgt gtggccaacc cctcaggcaa 4080 cctgacggagacctacgttc aggaccgtgg cgatggcatg tacaaagtgg agtacacgcc 4140 ttacgaggagggactgcact ccgtggacgt gacctatgac ggcagtcccg tgcccagcag 4200 ccccttccaggtgcccgtga ccgagggctg cgacccctcc cgggtgcgtg tccacgggcc 4260 aggcatccaaagtggcacca ccaacaagcc caacaagttc actgtggaga ccaggggagc 4320 tggcacgggcggcctgggcc tggctgtaga gggcccctcc gaggccaaga tgtcctgcat 4380 ggataacaaggacggcagct gctcggtcga gtacatccct tatgaggctg gcacctacag 4440 cctcaacgtcacctatggtg gccatcaagt gccaggcagt cctttcaagg tccctgtgca 4500 tgatgtgacagatgcgtcca aggtcaagtg ctctgggccc ggcctgagcc caggcatggt 4560 tcgtgccaacctccctcagt ccttccaggt ggacacaagc aaggctggtg tggccccatt 4620 gcaggtcaaagtgcaagggc ccaaaggcct ggtggagcca gtggacgtgg tagacaacgc 4680 tgatggcacccagaccgtca attatgtgcc cagccgagaa gggccctaca gcatctcagt 4740 actgtatggagatgaagagg taccccggag ccccttcaag gtcaaggtgc tgcctactca 4800 tgatgccagcaaggtgaagg ccagtggccc cgggctcaac accactggcg tgcctgccag 4860 cctgcccgtggagttcacca tcgatgcaaa ggacgccggg gagggcctgc tggctgtcca 4920 gatcacggatcccgaaggca agccgaagaa gacacacatc caagacaacc atgacggcac 4980 gtatacagtggcctacgtgc cagacgtgac aggtcgctac accatcctca tcaagtacgg 5040 tggtgacgagatccccttct ccccgtaccg cgtgcgtgcc gtgcccaccg gggacgccag 5100 caagtgcactgtcacagtgt caatcggagg tcacgggcta ggtgctggca tcggccccac 5160 cattcagattggggaggaga cggtgatcac tgtggacact aaggcggcag gcaaaggcaa 5220 agtgacgtgcaccgtgtgca cgcctgatgg ctcagaggtg gatgtggacg tggtggagaa 5280 tgaggacggcactttcgaca tcttctacac ggccccccag ccgggcaaat acgtcatctg 5340 tgtgcgctttggtggcgagc acgtgcccaa 
cagccccttc caagtgacgg ctctggctgg 5400 ggaccagccctcggtgcagc cccctctacg gtctcagcag ctggccccac agtacaccta 5460 cgcccagggcggccagcaga cttgggcccc ggagaggccc ctggtgggtg tcaatgggct 5520 ggatgtgaccagcctgaggc cctttgacct tgtcatcccc ttcaccatca agaagggcga 5580 gatcacaggggaggttcgga tgccctcagg caaggtggcg cagcccacca tcactgacaa 5640 caaagacggcaccgtgaccg tgcggtatgc acccagcgag gctggcctgc acgagatgga 5700 catccgctatgacaacatgc acatcccagg aagccccttg cagttctatg tggattacgt 5760 caactgtggccatgtcactg cctatgggcc tggcctcacc catggagtag tgaacaagcc 5820 tgccaccttcaccgtcaaca ccaaggatgc aggagagggg ggcctgtctc tggccattga 5880 gggcccgtccaaagcagaaa tcagctgcac tgacaaccag gatgggacat gcagcgtgtc 5940 ctacctgcctgtgctgccgg gggactacag cattctagtc aagtacaatg aacagcacgt 6000 cccaggcagccccttcactg ctcgggtcac aggtgacgac tccatgcgta tgtcccacct 6060 aaaggtcggctctgctgccg acatccccat caacatctca gagacggatc tcagcctgct 6120 gacggccactgtggtcccgc cctcgggccg ggaggagccc tgtttgctga agcggctgcg 6180 taatggccacgtggggattt cattcgtgcc caaggagacg ggggagcacc tggtgcatgt 6240 gaagaaaaatggccagcacg tggccagcag ccccatcccg gtggtgatca gccagtcgga 6300 aattggggatgccagtcgtg ttcgggtctc tggtcagggc cttcacgaag gccacacctt 6360 tgagcctgcagagtttatca ttgatacccg cgatgcaggc tatggtgggc tcagcctgtc 6420 cattgagggccccagcaagg tggacatcaa cacagaggac ctggaggacg ggacgtgcag 6480 ggtcacctactgccccacag agccaggcaa ctacatcatc aacatcaagt ttgccgacca 6540 gcacgtgcctggcagcccct tctctgtgaa ggtgacaggc gagggccggg tgaaagagag 6600 catcacccgcaggcgtcggg ctccttcagt ggccaacgtt ggtagtcatt gtgacctcag 6660 cctgaaaatccctgaaatta gcatccagga tatgacagcc caggtgacca gcccatcggg 6720 caagacccatgaggccgaga tcgtggaagg ggagaaccac acctactgca tccgctttgt 6780 tcccgctgagatgggcacac acacagtcag cgtcaagtac aagggccagc acgtgcctgg 6840 gagccccttccagttcaccg tggggcccct aggggaaggg ggagcccaca aggtccgagc 6900 tgggggccctggcctggaga gagctgaagc tggagtgcca gccgaattca gtatctggac 6960 ccgggaagctggtgctggag gcctggccat tgctgtcgag ggccccagca aggctgagat 7020 ctcttttgaggaccgcaagg acggctcctg tggtgtggct tatgtggtcc aggagccagg 7080 
tgactacgaagtctcagtca agttcaacga ggaacacatt cccgacagcc ccttcgtggt 7140 gcctgtggcttctccgtctg gcgacgcccg ccgcctcact gtttctagcc ttcaggagtc 7200 agggctaaaggtcaaccagc cagcctcttt tgcagtcagc ctgaacgggg ccaagggggc 7260 gatcgatgccaaggtgcaca gcccctcagg agccctggag gagtgctatg tcacagaaat 7320 tgaccaagataagtatgctg tgcgcttcat ccctcgggag aatggcgttt acctgattga 7380 cgtcaagttcaacggtaccc acatccctgg aagccccttc aagatccgag ttggggagcc 7440 tgggcatggaggggacccag gcttggtgtc tgcttacgga gcaggtctgg aaggcggtgt 7500 cacagggaacccagctgagt tcgtcgtgaa cacgagcaat gcgggagctg gtgccctgtc 7560 ggtgaccattgacggcccct ccaaggtgaa gatggattgc caggagtgcc ctgagggcta 7620 ccgcgtcacctataccccca tggcacctgg cagctacctc atctccatca agtacggcgg 7680 cccctaccacattgggggca gccccttcaa ggccaaagtc acaggccccc gtctcgtcag 7740 caaccacagcctccacgaga catcatcagt gtttgtagac tctctgacca aggccacctg 7800 tgccccccagcatggggccc cgggtcctgg gcctgctgac gccagcaagg tggtggccaa 7860 gggcctggggctgagcaagg cctacgtagg ccagaagagc agcttcacag tagactgcag 7920 caaagcaggcaacaacatgc tgctggtggg ggttcatggc ccaaggaccc cctgcgagga 7980 gatcctggtgaagcacgtgg gcagccggct ctacagcgtg tcctacctgc tcaaggacaa 8040 gggggagtacacactggtgg tcaaatgggg gcacgagcac atcccaggca gcccctaccg 8100 cgttgtggtgccctgagtct ggggcccgtg ccagccggca gcccccaagc ctgccccgct 8160 acccaagcagccccgccctc ttcccctcaa ccccggccca ggccgccctg gccgcccgcc 8220 tgtcactgcagctgcccctg ccctgtgccg tgctgcgctc acctgcctcc ccagccagcc 8280 gctgacctctcggctttcac ttgggcagag ggagccattt ggtggcgctg cttgtcttct 8340 ttggttctgggaggggtgag ggatgggg 8368 17 5238 DNA Homo sapiens 17 gctgtggctgcggctgcggc tgcggctgag atttggccgg gcgtccgcag gccgtggggg 60 atgggggcagcgagctccag ccctcggcgg tggcggcggc cgtaggtgtg gggcgggcgt 120 ccgcgtccggcacgcgagat ggagcgccgt ggatttcagt ttttctgact gttacatgaa 180 aggatgattgctcacaaaca gaaaaagaca aagaaaaaac gtgcttgggc atcaggtcaa 240 ctctctactgatattacaac ttctgaaatg gggctcaagt ccttaagttc caactctatt 300 tttgatccggattacatcaa ggagttggtg aatgatatca ggaagttctc ccacatctta 360 ctatatttgaaagaagccat attttcagac tgttttaaag aagttattca 
tatacgtcta 420 gaggaactgctccgtgtttt aaagtctata atgaataaac atcagaacct caattctgtt 480 gatcttcaaaatgctgcaga aatgctcact gcaaaagtga aagctgtgaa cttcacagaa 540 gttaatgaagaaaacaaaaa cgatctcttc caggaagtgt tttcttctat tgaaactttg 600 gcatttacctttggaaatat ccttacaaac ttccttatgg gagatgtagg caatgattca 660 ttcttgcgactgcctgtttc tcgagaaact aagtcgtttg aaaatgtttc tgtggaatca 720 gtggactcatccagtgaaaa aggaaatttt tcccctttag aactagacaa cgtgctgtta 780 aagaacactgactctatcga gctggctttg tcatatgcta aaacttggtc aaaatatact 840 aagaacatagtttcatgggt tgaaaaaaag cttaacttgg aattggagtc cactagaaat 900 atggtcaagttggcagaggc aactagaact aacattggaa ttcaggagtt catgccactg 960 cagtctctgtttactaatgc tcttcttaat gatatagaaa gcagtcacct tttacaacaa 1020 acaattgcagctctccaggc taacaaattt gtgcagcctc tacttggaag gaaaaatgaa 1080 atggaaaaacaaaggaaaga aataaaagag ctttggaaac aggagcaaaa taaaatgctt 1140 gaagcagagaatgctctcaa aaaggcaaaa ttattatgca tgcaacgtca agatgaatat 1200 gagaaagcaaagtcttccat gtttcgtgca gaagaggagc atctgtcttc aagtggcgga 1260 ttagcaaaaaatctcaacaa gcaactagaa aaaaagcgaa ggttggaaga ggaggctctc 1320 caaaaagtagaagaagcaga tgaactttac aaagtttgtg tgacaaatgt tgaagaaaga 1380 agaaatgatgtagaaaatac caaaagagaa attttagcac aactccggac acttgttttc 1440 cagtgtgatcttacccttaa agcggtaaca gttaacctct tccacatgca gcatctgcag 1500 gctgcttcccttgcagacag attacagtct ctctgtggta gtgccaaact ctatgaccca 1560 ggccaagagtacagtgaatt tgtcaaggcc acaaattcaa ctgaagaaga aaaagttgat 1620 ggaaatgtaaataaacattt aaatagttcc caaccttcag gatttggacc tgccaactct 1680 ttagaggatgttgtacgcct tcctgacagt tctaataaaa ttgaagagga cagatgctct 1740 aacagtgcagatataacagg tccttccttt ataagatcat ggacatttgg gatgtttagt 1800 gattctgagagcactggagg gagcagcgaa tctagatctc tggattcaga atctataagt 1860 ccaggagactttcatcgaaa acttccacga acaccatcca gtggaactat gtcctctgca 1920 gatgatctagatgaaagaga gccaccttcc ccttcagaaa ctggacccaa ttcccttgga 1980 acatttaagaaaacattgat gtcaaaggca gctctcacac acaagtttcg caaattgaga 2040 tcccccacgaaatgtaggga ttgtgaaggc attgtagtgt tccaaggtgt tgaatgtgaa 2100 gagtgtctccttgtttgtca tcgaaagtgt 
ttggaaaatt tagtcattat ttgtggtcat 2160 cagaaacttccaggaaaaat acacttattt ggagcagaat tcacactagt tgcaaaaaag 2220 gaaccagatggtatcccttt tatactcaaa atatgtgcct cagagattga aaatagagct 2280 ttgtgtctacagggaattta tcgtgtgtgt ggaaacaaaa taaaaactga aaaattgtgt 2340 ctagctttggaaaatggtat gcacttggta gatatttcag aatttagttc acatgatatc 2400 tgtgacgtcttgaaattata ccttcggcag ctcccagaac catttatttt atttcgattg 2460 tacaaggaatttatagacct tgcaaaagag atccaacatg taaatgaaga acaagagaca 2520 aaaaagaatagtcttgaaga caaaaaatgg ccaaatatgt gtatagaaat aaaccgaatt 2580 cttctaaaaagcaaagacct tctaagacaa ttgccagcat caaattttaa cagtcttcat 2640 ttccttatagtacatctaaa gcgggtagta gatcatgcag aagaaaacaa gatgaactcc 2700 aaaaacttgggggtgatatt tggaccaagt ctcattaggc caaggccaca aactgctcct 2760 atcaccatctcctcccttgc agagtattca aatcaagcac gcttggtaga gtttctcatt 2820 acttactcacagaagatctt cgatgggtcc ctacaaccac aagatgttat gtgtagcata 2880 ggtgttgttgatcaaggctg ttttccaaag cctctgttat caccagaaga aagagacatt 2940 gaacgttccatgaagtcact atttttttct tcaaaggaag atatccatac ttcagagagt 3000 gaaagcaaaatttttgaacg agctacatca tttgaggaat cagaacgcaa gcaaaatgcg 3060 ttaggaaaatgtgatgcatg tctcagtgac aaagcacagt tgcttctaga ccaagaggct 3120 gaatcagcatcccaaaagat agaagatggt aaagccccta agccactttc tctgaaatct 3180 gataggtcaacaaacaatgt ggagaggcat actccaagga ccaagattag acctgtaagt 3240 ttgcctgtagatagactact tcttgcaagt cctcctaatg agagaaatgg cagaaatatg 3300 ggaaatgtaaatttagacaa gttttgcaag aatcctgcct ttgaaggagt taatagaaaa 3360 gacgctgctactactgtttg ttccaaattt aatggctttg accagcaaac tctacagaaa 3420 attcaggacaaacagtatga acaaaacagc ctaactgcca agactacaat gatcatgccc 3480 agtgcactccaggaaaaagg agtgacaaca agcctccaga ttagtgggga ccattctatc 3540 aatgccactcaacccagtaa gccatatgca gagccagtca ggtcagtgag agaggcatct 3600 gagagacggtcttcagattc ctaccctctc gctcctgtca gagcacccag aacactgcag 3660 cctcaacattggacaacatt ttataaacca catgctccca tcatcagtat cagggggaat 3720 gaggagaagccagcttcacc ctcagcagca tgccctcctg gcacagatca cgatccccac 3780 ggtctcgtggtgaagtcaat gccagaccca gacaaagcat cagcttgtcc tgggcaagca 3840 
actggtcaacctaaagaaga ctctgaggag cttggcttgc ctgatgtgaa tccaatgtgt 3900 cagagaccaaggctaaaacg aatgcaacag tttgaagacc tcgaagatga aattccacaa 3960 tttgtgtagggatgtcaaat ttcagggttt ttttgttgtt gttgtgttat tttgtggtat 4020 tgtgcttgttttgtgaaaga atgttttgac agggcccctt ttgtatagga ctgccaaatc 4080 atgggttttgccttttgttg ttgtatttat cctctgttgg taatactgaa tggtagaatg 4140 ttttgatagggtcacatttg tgcctcactg gaattatctt taaattctgt atttttaaag 4200 ttgtgaataagataggtgga ttcgtatttt ttaaagttca gttgactttc cccaccaaat 4260 ggtccatttgaatgcatccc taatatatga tatagtctca actaataggt gcaatttggg 4320 aaaatcaggtttattttttg gagtggaact gttataagtg cttatttata aaaggaatgt 4380 ttctgaatgcaagtgcctaa aaagatcttt gttggtatgc atatgttttg tcacacaatt 4440 tatagtgcatctttcaccat ttgtgctttt ttaagatagt atgtaagctc ttatttttca 4500 attggcaattcagttaattt ttaaatgttt acataatggc cagaaggctt gcaaatctgt 4560 atttaattgcattttaatta attgccagtt tttacatgta gtagtcagtt gtacaaagaa 4620 aatgcacttaaacctgtttc taaattatat attcagttat attatatttg gctttagatg 4680 gttttaatacatttgatagt ttttcacccc ttggctttat tttatataaa cttttgtttt 4740 tcagcagttctgaacttttt agtattttat aaatggtcca aaaaatgcct gtttcagaag 4800 tttttgaattcagtgcattt cctcttgatt tgtctgggtt aaaaccattc cttttgtatg 4860 aaatgttttgacttaggaat cattttatgt acttgttcta cctggattgt caacaactga 4920 aagtacatatttcatccaaa tcaagctaaa atttatttaa gttgattctg agagtacagg 4980 tcagtaagcctcattatttg gaatttgaga gaagtatagg tgatcggatc tgtttcattt 5040 ataaaaggtccagtttttag gactagtaca ttcctgttat tttctgggtt ttatcatttt 5100 gcctaaaataggatataaaa gggacaaaaa ataagtagac tgtttttatg tgtgaattat 5160 atttctactaaatgtttttg tatgactgtg ttatacttga taatatatat atatatatat 5220 aaaaaaaaaaaaaaaaaa 5238 18 4929 DNA Homo sapiens Unsure (3529)..(3529) a, or c, org, or t 18 cctagtatca cactgtgcca ccaacgagtc tgtggagtcc atcaccgccttccagtgccc 60 cacctgccgg catgtcatca ccctcagcca gcgaggtcta gacgggctcaagcgcaacgt 120 caccctacag aacatcatcg acaggttcca gaaagcatca gtgagcgggcccaactctcc 180 cagcgagacc cgtcgggagc gggcctttga cgccaacacc atgacctccgccgagaaggt 240 cctctgccag ttttgtgacc 
aggatcctgc ccaggacgct gtgaagacctgtgtcacttg 300 tgaagtatcc tactgtgacg agtgcctgaa agccactcac ccgaataagaagccctttac 360 aggccatcgt ctgattgagc caattccgga ctctcacatc cgggggctgatgtgcttgga 420 gcatgaggat gagaaggtga atatgtactg tgtgaccgat gaccagttaatctgtgcctt 480 gtgtaaactg gttgggcggc accgcgatca tcaggtggca gctttgagtgagcgctatga 540 caaattgaag caaaacttag agagtaacct caccaacctt attaagaggaacacagaact 600 ggagaccctt ttggctaaac tcatccaaac ctgtcaacat gttgaagtcaatgcatcacg 660 tcaagaagcc aaattgacag aggagtgtga tcttctcatt gagatcattcagcaaagacg 720 acagattatt ggaaccaaga tcaaagaagg gaaggtgatg aggcttcgcaaactggctca 780 gcagattgca aactgcaaac agtgcattga gcggtcagca tcactcatctcccaagcgga 840 acactctctg aaggagaatg atcatgcgcg tttcctacag actgctaagaatatcaccga 900 gagagtctcc atggcaactg catcctccca ggttctaatt cctgaaatcaacctcaatga 960 cacatttgac acctttgcct tagatttttc ccgagagaag aaactgctagaatgtctgga 1020 ttaccttaca gctcccaacc ctcccacaat tagagaagag ctctgcacagcttcatatga 1080 caccatcact gtgcattgga cctccgatga tgagttcagc gtggtctcctacgagctcca 1140 gtacaccata ttcaccggac aagccaacgt cgttagtctg tgtaattcggctgatagctg 1200 gatgatagta cccaacatca agcagaacca ctacacggtg cacggtctgcagagcggcac 1260 caagtacatc ttcatggtca aggccatcaa ccaggcgggc agccgcagcagtgagcctgg 1320 gaagttgaag acaaacagcc aaccatttaa actggatccc aaatctgctcatcgaaaact 1380 gaaggtgtcc catgataact tgacagtaga acgtgatgag tcatcatccaagaagagtca 1440 cacacctgaa cgcttcacca gccaggggag ctatggagta gctggaaatgtgtttattga 1500 tagtggccgg cattattggg aagtggtcat aagtggaagc acatggtatgccattggtct 1560 tgcttacaaa tcagccccga agcatgaatg gattgggaag aactctgcttcctgggcgct 1620 ctgccgctgc aacaataact gggtggtgag acacaatagc aaggaaatccccattgagcc 1680 tgccccccac ctccggcgcg tgggcatcct gctggactat gataacggctctatcgcctt 1740 ttatgatgct ttgaactcca tccacctcta caccttcgac gtcgcatttgcgcagcctgt 1800 ttgccccacc ttcaccgtgt ggaacaagtg tctgacgatt atcactgggctccctatccc 1860 agaccatttg gactgcacag agcagctgcc gtgagcgtct ggccacatggagctgctttc 1920 tggggaacag taaggttcag gccactattt aggggactga gaaagcacaggcttcatgag 1980 
tgtaatgaaa tctcaccaga agtgtcccga aatcggctca gatagggctcaaaacaagag 2040 attcctctcc ttttactgtg tcttgtatta agtacgggct ttaataatttctttaatttt 2100 tttgtattta gaggaaaatc tatagattat ttataagaga aacataatcaggattacaac 2160 ttttaggaat tacttggttt tgcacattaa gaagcccata agtttatcagctatttacaa 2220 ccttcatttc atcacaatct gtgggcttac aaaaaaacaa aaaacttttgtagttttgta 2280 tgttactcat cttcttacct gatatcccat gatgatccca tggtaggtcttctcacctcg 2340 atggtgcata acaggatgtg tttgaaccta gtaggggagg aaacaggctttcttactctg 2400 gtttaatttg aagtgtttta attgtgatgt caaaaagttg tatcagatcaactaaaatgg 2460 agagcaagac agagaatgaa aagagttgat tttggacctc ggaccttgccgtggctaaat 2520 ctttaccttc tcatagctga tgggataatg ttggaaagaa aggttgtgaatcctttggcc 2580 acattttgcc ctgcttctct cagggttaag ggttctggaa gaacattaagaatgagatgc 2640 aattgaaaat agtcattttg aatcctattg attattcaaa aattcaggctgattgtcttt 2700 tatcagaggt aggattctgt tttatagtat agaatctact ttatcccttccttttaatag 2760 ttcctttaga cctgtgaaat ttcttcacta catttaatag ttctcctatttcccgctccc 2820 ccatatcaat tttccttttg tctccggggc tgagtaaata aacatgttctgtcacaaata 2880 gcagcaccac tttggattga ttttgctctc caggacatca gcacatggccctgatcagca 2940 ctaccacatc caaacataag tcactgaaaa acacttaata tttatgagttggtaatgaca 3000 agggacattg tataaagtac tatttgctag attcatgcct caaaagttattataaacaga 3060 cctttattaa acacatcttg aaagatgtag aagtccctct atagtctagtatagtttaca 3120 atagagttgt aagaccaaaa aaaaaaaaaa aaaaaattca aactcgttattcaggaacct 3180 gcttataaaa tgtcagctgg gattctttgc atgccaatct gatgcgttggaatggtccat 3240 gaattaaggc tttcttgagc agttcttggc ccagaactct ggcattggttctagtttgat 3300 gaagggcatg acctctataa atggtttcaa ttgctaaaat atttacctgggatactgggt 3360 cagccatttt gactgagcag actagtggat ttagacattg tttttagttattttgttttt 3420 aaccaaatca accaactgcc tccctgaaat aagtcaatga actcattgtttcagcatcac 3480 gtggccaaag gtcatgtgat tgcaaatctg gatttcaagg gaggccaanccagcttcctg 3540 ggtccttcca tcctcttccc tagcagacac tctccttttt cttaacagataggattctat 3600 atatttacta tattatttat acccagtatg aatattttga tagatacctaagacaatttc 3660 acatctaaaa gatggacgcc tcaatggaaa 
aaaaacaatc tttctctggaaaccttatag 3720 gtttttcttt ttattacaat ataaaagcaa tgtgtgtttg ccttctctgagtaaactgaa 3780 agggttgtct cagtaatttt tacatacatt ttgggtaact ggataatggatattttaatg 3840 cactttgtac actaacaggt tctaaataaa agggtctaaa actcagcttctgagttttta 3900 aaatcacggt ctccaggtac caataaatgc tacagtttgc cttatgatgttaacataaaa 3960 cacttagtag aaggacaata tttccatgaa aataatgttt ttcaatattaagaagttact 4020 actcaaattt tcacagtaag ccatttaggg tatgtttggc tatttttataaggacatgag 4080 agattatgtc ataattttgt tgtggaagtc tcactcttgg ctaacttaaaagcattgtgg 4140 atagtagcag ttactagttc caggttgtca tatttacagg aaaatatgtatatggtgaaa 4200 ggccaccgtg tttaattact ataatgatgt agaaaagatt cccgtgtgaatttttttttt 4260 tgaaagtcta aaaaatgtat gctgtaaaaa tttgctgcag tgtaatttgcattctcttaa 4320 actgattgag gtcacagtat tttattattg gggtcctcac cacaggaaacactgcgatac 4380 aggggcaaaa gagatggcag tgccaattaa attaatacaa caaaatcaatgcagcaccaa 4440 ccaagactgc caggtctggt gtcatgggta tgcccagagc ccaggagttcagaagggccc 4500 taagcctgat ttaatgctct gctgttgatg tcttgaaatt cttaacaatttttgaacaag 4560 gggcctgcgt tttcacttcg cactgggcct tgcaaattac atagcgagtgctcataaaag 4620 aactcagaaa cgtggtacct ctcttcctgg tggatacaaa taaagaaatctggatccaaa 4680 gttgaaagtt gctggcgata tcattcaagt aggactctaa atagtggattaagatgaggg 4740 tgggcctggg tgaagattct ttccagcttt aaaagaaagt gacttcaaaaactgactgca 4800 aatattgacg atggtttctg ctggaggaaa agaaacagct tgaatacagacaggcttttt 4860 tattacggta ctgatatatt gaccttaaac ttgctgagga actgaactaacgtcctccag 4920 tgaccgtgg 4929 19 3614 DNA Homo sapiens 19 gtccgccaaaacctgcgcgg atagggaaga acagcacccc ggcgccgatt gccgtaccaa 60 acaagcctaacgtccgctgg gccccggacg ccgcgcggaa aagatgaatt tacaaccaat 120 tttctggattggactgatca gttcagtttg ctgtgtgttt gctcaaacag atgaaaatag 180 atgtttaaaagcaaatgcca aatcatgtgg agaatgtata caagcagggc caaattgtgg 240 gtggtgcacaaattcaacat ttttacagga aggaatgcct acttctgcac gatgtgatga 300 tttagaagccttaaaaaaga agggttgccc tccagatgac atagaaaatc ccagaggctc 360 caaagatataaagaaaaata aaaatgtaac caaccgtagc aaaggaacag cagagaagct 420 caagccagaggatattcatc agatccaacc 
acagcagttg gttttgcgat taagatcagg 480 ggagccacagacatttacat taaaattcaa gagagctgaa gactatccca ttgacctcta 540 ctaccttatggacctgtctt attcaatgaa agacgatttg gagaatgtaa aaagtcttgg 600 aacagatctgatgaatgaaa tgaggaggat tacttcggac ttcagaattg gatttggctc 660 atttgtggaaaagactgtga tgccttacat tagcacaaca ccagctaagc tcaggaaccc 720 ttgcacaagtgaacagaact gcaccacccc atttagctac aaaaatgtgc tcagtcttac 780 taataaaggagaagtattta atgaacttgt tggaaaacag cgcatatctg gaaatttgga 840 ttctccagaaggtggtttcg atgccatcat gcaagttgca gtttgtggat cactgattgg 900 ctggaggaatgttacacggc tgctggtgtt ttccacagat gccgggtttc actttgctgg 960 agatgggaaacttggtggca ttgttttacc aaatgatgga caatgtcacc tggaaaataa 1020 tatgtacacaatgagccatt attatgatta tccttctatt gctcaccttg tccagaaact 1080 gagtgaaaataatattcaga caatttttgc agttactgaa gaatttcagc ctgtttacaa 1140 ggagctgaaaaacttgatcc ctaagtcagc agtaggaaca ttatctgcaa attctagcaa 1200 tgtaattcagttgatcattg atgcatacaa ttccctttcc tcagaagtca ttttggaaaa 1260 cggcaaattgtcagaaggag taacaataag ttacaaatct tactgcaaga acggggtgaa 1320 tggaacaggggaaaatggaa gaaaatgttc caatatttcc attggagatg aggttcaatt 1380 tgaaattagcataacttcaa ataagtgtcc aaaaaaggat tctgacagct ttaaaattag 1440 gcctctgggctttacggagg aagtagaggt tattcttcag tacatctgtg aatgtgaatg 1500 ccaaagcgaaggcatccctg aaagtcccaa gtgtcatgaa ggaaatggga catttgagtg 1560 tggcgcgtgcaggtgcaatg aagggcgtgt tggtagacat tgtgaatgca gcacagatga 1620 agttaacagtgaagacatgg atgcttactg caggaaagaa aacagttcag aaatctgcag 1680 taacaatggagagtgcgtct gcggacagtg tgtttgtagg aagagggata atacaaatga 1740 aatttattctggcaaattct gcgagtgtga taatttcaac tgtgatagat ccaatggctt 1800 aatttgtggaggaaatggtg tttgcaagtg tcgtgtgtgt gagtgcaacc ccaactacac 1860 tggcagtgcatgtgactgtt ctttggatac tagtacttgt gaagccagca acggacagat 1920 ctgcaatggccggggcatct gcgagtgtgg tgtctgtaag tgtacagatc cgaagtttca 1980 agggcaaacgtgtgagatgt gtcagacctg ccttggtgtc tgtgctgagc ataaagaatg 2040 tgttcagtgcagagccttca ataaaggaga aaagaaagac acatgcacac aggaatgttc 2100 ctattttaacattaccaagg tagaaagtcg ggacaaatta ccccagccgg tccaacctga 2160 
tcctgtgtcccattgtaagg agaaggatgt tgacgactgt tggttctatt ttacgtattc 2220 agtgaatgggaacaacgagg tcatggttca tgttgtggag aatccagagt gtcccactgg 2280 tccagacatcattccaattg tagctggtgt ggttgctgga attgttctta ttggccttgc 2340 attactgctgatatggaagc ttttaatgat aattcatgac agaagggagt ttgctaaatt 2400 tgaaaaggagaaaatgaatg ccaaatggga cacgggtgaa aatcctattt ataagagtgc 2460 cgtaacaactgtggtcaatc cgaagtatga gggaaaatga gtactgcccg tgcaaatccc 2520 acaacactgaatgcaaagta gcaatttcca tagtcacagt taggtagctt tagggcaata 2580 ttgccatggttttactcatg tgcaggtttt gaaaatgtac aatatgtata atttttaaaa 2640 tgttttattattttgaaaat aatgttgtaa ttcatgccag ggactgacaa aagacttgag 2700 acaggatggttattcttgtc agctaaggtc acattgtgcc tttttgacct tttcttcctg 2760 gactattgaaatcaagctta ttggattaag tgatatttct atagcgattg aaagggcaat 2820 agttaaagtaatgagcatga tgagagtttc tgttaatcat gtattaaaac tgatttttag 2880 ctttacatatgtcagtttgc agttatgcag aatccaaagt aaatgtcctg ctagctagtt 2940 aaggattgttttaaatctgt tattttgcta tttgcctgtt agacatgact gatgacatat 3000 ctgaaagacaagtatgttga gagttgctgg tgtaaaatac gtttgaaata gttgatctac 3060 aaaggccatgggaaaaattc agagagttag gaaggaaaaa ccaatagctt taaaacctgt 3120 gtgccattttaagagttact taatgtttgg taacttttat gccttcactt tacaaattca 3180 agccttagataaaagaaccg agcaattttc tgctaaaaag tccttgattt agcactattt 3240 acatacaggccatactttac aaagtatttg ctgaatgggg accttttgag ttgaatttat 3300 tttattatttttattttgtt taatgtctgg tgctttctat cacctcttct aatcttttaa 3360 tgtatttgtttgcaattttg gggtaagact tttttatgag tactttttct ttgaagtttt 3420 agcggtcaatttgccttttt aatgaacatg tgaagttata ctgtggctat gcaacagctc 3480 tcacctacgcgagtcttact ttgagttagt gccataacag accactgtat gtttacttct 3540 caccatttgagttgcccatc ttgtttcaca ctagtcacat tcttgtttta agtgccttta 3600 gttttaacagttca 3614 20 4986 DNA Homo sapiens 20 aggaggctgc cgctctggct tgccgccccccgccgccgct gcacaccgga cccagccgcc 60 gtgccgcggg ccatggacct gcccaggggcctggtggtgg cctgggcgct cagcctgtgg 120 ccagggttca cggacacctt caacatggacaccaggaagc cccgggtcat ccctggctcc 180 aggaccgcct tctttggcta cacagtgcagcagcacgaca tcagtggcaa taagtggctg 
240 gtcgtgggcg ccccactgga aaccaatggctaccagaaga cgggagacgt gtacaagtgt 300 ccagtgatcc acgggaactg caccaaactcaacctgggaa gggtcaccct gtccaacgtg 360 tccgagcgga aagacaacat gcgcctcggccttagtctcg ccaccaaccc caaggacaac 420 agcttcctgg cctgcagccc cctctggtctcatgagtgtg ggagctccta ctacaccaca 480 gggatgtgtt caagagtcaa ctccaacttcaggttctcca agaccgtggc cccagctctc 540 caaaggtgcc agacctacat ggacatcgtcattgtcctgg atggctccaa cagcatctac 600 ccctgggtgg aggttcagca cttcctcatcaacatcctga aaaagtttta cattggccca 660 gggcagatcc aggttggagt tgtgcagtatggcgaagatg tggtgcatga gtttcacctc 720 aatgactaca ggtctgtaaa agatgtggtggaagctgcca gccacattga gcagagagga 780 ggaacagaga cccggacggc atttggcattgaatttgcac gctcagaggc tttccagaag 840 ggtggaagga aaggagccaa gaaggtgatgattgtcatca cagatgggga gtcccacgac 900 agcccagacc tggagaaggt gatccagcaaagcgaaagag acaacgtaac aagatatgcg 960 gtggccgtcc tgggctacta caaccgcagggggatcaatc cagaaacttt tctaaatgaa 1020 atcaaataca tcgccagtga ccctgatgacaagcacttct tcaatgtcac tgatgaggct 1080 gccttgaagg acattgtcga tgccctgggggacagaatct tcagcctgga aggcaccaac 1140 aagaacgaga cctcctttgg gctggagatgtcacagacgg gcttttcctc gcacgtggtg 1200 gaggatgggg ttctgctggg agccgtcggtgcctatgact ggaatggagc tgtgctaaag 1260 gagacgagtg ccgggaaggt cattcctctccgcgagtcct acctgaaaga gttccccgag 1320 gagctcaaga accatggtgc atacctggggtacacagtca catcggtcgt gtcctccagg 1380 caggggcgag tgtacgtggc cggagccccccggttcaacc acacgggcaa ggtcatcctg 1440 ttcaccatgc acaacaaccg gagcctcaccatccaccagg ctatgcgggg ccagcagata 1500 ggctcttact ttgggagtga aatcacctcggtggacatcg acggcgacgg cgtgactgat 1560 gtcctgctgg tgggcgcacc catgtacttcaacgagggcc gtgagcgagg caaggtgtac 1620 gtctatgagc tgagacagaa ccggtttgtttataacggaa cgctaaagga ttcacacagt 1680 taccagaatg cccgatttgg gtcctccattgcctcagttc gagacctcaa ccaggattcc 1740 tacaatgacg tggtggtggg agcccccctggaggacaacc acgcaggagc catctacatc 1800 ttccacggct tccgaggcag catcctgaagacacctaagc agagaatcac agcctcagag 1860 ctggctaccg gcctccagta ttttggctgcagcatccacg ggcaattgga cctcaatgag 1920 gatgggctca tcgacctggc agtgggagcccttggcaacg 
ctgtgattct gtggtcccgc 1980 ccagtggttc agatcaatgc cagcctccactttgagccat ccaagatcaa catcttccac 2040 agagactgca agcgcagtgg cagggatgccacctgcctgg ccgccttcct ctgcttcacg 2100 cccatcttcc tggcacccca tttccaaacaacaactgttg gcatcagata caacgccacc 2160 atggatgaga ggcggtatac accgagggcccacctggacg agggcgggga ccgattcacc 2220 aacagagccg tactgctctc ctccggccaggagctctgtg agcggatcaa cttccatgtc 2280 ctggacactg ctgactacgt gaagccagtgaccttctcag tcgagtattc cctggaggac 2340 cctgaccatg gccccatgct ggacgacggctggcccacca ctctcagagt ctcggtgccc 2400 ttctggaacg gctgcaatga ggatgagcactgtgtccctg accttgtgtt ggatgcccgg 2460 agtgacctgc ccacggccat ggagtactgccagagggtgc tgaggaagcc tgcgcaggac 2520 tgctccgcat acacgctgtc cttcgacaccacagtcttca tcatagagag cacacgccag 2580 cgagtggcgg tggaggccac actggagaacaggggcgaga acgcctacag cacggtccta 2640 aatatctcgc agtcagcaaa cctgcagtttgccagcttga tccagaagga ggactcagac 2700 ggtagcattg agtgtgtgaa cgaggagaggaggctccaga agcaagtctg caacgtcagc 2760 tatcccttct tccgggccaa ggccaaggtggctttccgtc ttgattttga gttcagcaaa 2820 tccatcttcc tacaccacct ggagatcgagctcgctgcag gcagtgacag taatgagcgg 2880 gacagcacca aggaagacaa cgtggcccccttacgcttcc acctcaaata cgaggctgac 2940 gtcctcttca ccaggagcag cagcctgagccactacgagg tcaagctcaa cagctcgctg 3000 gagagatacg atggtatcgg gcctcccttcagctgcatct tcaggatcca gaacttgggc 3060 ttgttcccca tccacgggat tatgatgaagatcaccattc ccatcgccac caggagcggc 3120 aaccgcctac tgaagctgag ggacttcctcacggacgagg tagcgaacac gtcctgtaac 3180 atctggggca atagcactga gtaccggcccaccccagtgg aggaagactt gcgtcgtgct 3240 ccacagctga atcacagcaa ctctgatgtcgtctccatca actgcaatat acggctggtc 3300 cccaaccagg aaatcaattt ccatctactggggaacctgt ggttgaggtc cctaaaagca 3360 ctcaagtaca aatccatgaa aatcatggtcaacgcagcct tgcagaggca gttccacagc 3420 cccttcatct tccgtgagga ggatcccagccgccagatcg tgtttgagat ctccaagcaa 3480 gaggactggc aggtccccat ctggatcattgtaggcagca ccctgggggg cctcctactg 3540 ctggccctgc tggtcctggc actgtggaagctcggcttct ttagaagtgc caggcgcagg 3600 agggagcctg gtctggaccc cacccccaaagtgctggagt gaggctccag aggagacttt 3660 gagttgatgg 
gggccaggac accagtccaggtagtgttga gacccaggcc tgtggcccca 3720 ccgagctgga gcggagagga agccagctggctttgcactt gacctcatct cccgagcaat 3780 ggcgcctgct ccctccagaa tggaactcaagctggtttta agtggaactg ccctactggg 3840 agactgggac acctttaaca cagacccctagggatttaaa gggacacccc tacacacacc 3900 caggcccacg ccaaggcctc cctcaggctctgtggagggc atttgctgcc ccagctacta 3960 aggtgctagg aattcgtaat catccccatcctccagagaa acccagggag gaagactgta 4020 aatacgaacc caatctgcac actccaggcctctagttcca gaaggatcca agacaaaaca 4080 gatctgaatt ctgccctttt ctctcacccatcccacccct ccattggctc ccaagtcaca 4140 cccactccct tccccataga taggcccctggggctcccga agaatgaacc caagagcaag 4200 ggcttgatgg tgacagctgc aagccagggatgaagaaaga ctctgagatg tggagactga 4260 tggccaggca agtgggacca ggatactggacgctgtcctg agatgagagg tagccgggct 4320 ctgcacccac gtgcattcac attgaccgcaactcacacat tcccccacca gctgcagccc 4380 cttgctctca gctgccaacc ctcccgggtcacttttgttc ccaggtacct catgggaagc 4440 atgtggatga cacaatccct ggggctgtgcattcccacgt cttcttgctg cagcctgccc 4500 ctagacatgg acgcaccggc ctggctgcagctgggcagca ggggtagggg tagggagcct 4560 cccctccctg tatcaccccc tccctacacacacacacaca cacacacaca cacactgcct 4620 cccatccttc cctcatgccc gccagtgcacagggaagggc ttggccagcg ctgttgaggg 4680 gtcccctctg gaatgcactg aataaagcacgtgcaaggac tcccggagcc tgtgcagcct 4740 tggtggcaaa tatctcatct gccggcccccaggacaagtg gtatgaccag tgataatgcc 4800 ccaaggacaa ggggcgtgcc tggcgcccagtggagtaatt tatgccttag tcttgttttg 4860 aggtagaaat gcaaggggga cacatgaaaggcatcagtcc ccctgtgcat agtacgacct 4920 ttactgtcgt atttttgaaa aattaaaaatacagtgttta aaaacaaaaa aaaaaaaaaa 4980 aaaaaa 4986 21 2005 DNA Homosapiens 21 tcgagcggcc gcccgggcag gtggagtcgg cggctcagtt gtccatgaccctgaaggtcc 60 aggagtaccc gaccctcaag gtgccctacg agacgctgaa caaacgctttcgcgccgctc 120 agaagaacat tgaccgggag accagccacg tcaccatggt ggtggccgagctggagagga 180 cgttgagcgg ctgccccgcc gtggactccg tggtcagcct gctggacggcgtggtggaga 240 agctcagcgt cctcaagagg aaggcggtgg aatccatcca ggccgaggacgagagcgcca 300 agctgtgcaa gctccggatc gagcacctca aagagcatag cagcgaccagcccgcggcgg 360 ccagcgtgtg 
gaagaggaag cgcatggatc gcatgatggt ggagcacctgctgcgttgcg 420 gctactacaa cacggctgtc aagctggcgc gccagagcgg catcgagagctgcctggagt 480 tcagcctcag aatccaggag ttcattgaac tcatccggca gaataagagactggacgctg 540 tgagacatgc aagaaagcac ttcagccaag cagaagggag ccagctggacgaggtgcgcc 600 aggccatggg catgctggcc ttcccgcccg acacgcacat ctccgtacaaggaccttctg 660 gaccctgcac ggtggcggac ctgcacggtg gcggatctga tccagcagttccggtacgac 720 aactaccgac tacaccagct gggaaacaat tctgtgttca ccctcaccctgcaggccggc 780 ctctcagcca tcaagacacc acagtgctac aaggaggacg gcagctccaagagccctgtc 840 tgccctgtgt gcagccgctc cctgaacaag ctggcgcacg cctgcccatggcccactgtg 900 ccaactcccg cctggtctgc aagatttctg gcgacgtgat gaacgagaacaatccgccca 960 tgatgctgcc caacggctac gtctacggct acaattctct gctttctatccgtcaagatg 1020 ataaagtcgt gtgcccgaga accaaagaag tcttccactt ctcacaagccgagaaggtgt 1080 acatcatgta tgccccacgt cgtgaagcgc accctcgggg acgggctgcagtgggcgggg 1140 aggcacgctt cctcctgtcc cacgctccag cctgccgcgg cgtttctgtttcttgcgacc 1200 aaagatccgt gagcaacgat aaatactctt aggaagagag aaaataaggtttcataagtt 1260 tgtacttgaa aacatttgga ttggtaggat tttgtaacac gtcaaccatttgatgcttct 1320 gaaaagtact ttcaacttgc gaaggaaact cttctttaaa gactgacctaaacaccgagg 1380 gaaacttaag aacgtttaaa atataggagt ccgtgatttc cctgtgttttcagtttcttt 1440 ccttctgtga acgatgagac ttggagaacg ggctggtcct tcaccacttcctgttggccc 1500 tggctggccg ggaaggtggc agcggcaccg gagggacact tatggcttcattcgagagct 1560 gctgccaaaa cgcctgcgcg ccaccgtcgg gggctggctt cgaggacgccgcctgctcgc 1620 gggtcgtgtc cgcgggactg tgttcgtacg tgcatagttt cgatatcacatcgcggggct 1680 gtgttcggag tctgcgtcgt ttcgtataca caccctctgt gtgcgccttacttcctgctt 1740 cgagaatgta taacgtggaa atccacggga ccaaatttct gcagaggccttgccggatgg 1800 ttccataact gtagagtcta attgctatcc attacagaaa ttaatcgttcagttgaaaga 1860 agtactgatg acttttcaaa acaaatgaac caccgtagct gacagagaaccgtatcgtaa 1920 gaggtttgta gttagtgctt atttttgcat gttgatgttg actagctaataaactgtaaa 1980 tgtaaaaaaa aaaaaaaaaa aaaaa 2005 22 4354 DNA Homosapiens 22 gaattcgggg ggcgagtaag ccagcggcag gaccagcggg cgggggccacaacaaaagct 60 
ggcaggctga cagaggcggc ctcaggacgg accttctggc tactgaccgttttgctgtgg 120 ttttcccgga ttgtgtgtag gtgtgagatc aaccatgagt tccgttgcagttttgaccca 180 agagagtttt gctgaacacc gaagtgggct ggttccgcaa caaatcaaagttgccactct 240 aaattcagaa gaggagagcg accctccaac ctacaaggat gccttccctccacttcctga 300 gaaagctgct tgcctggaaa gtgcccagga acccgctgga gcctgggggaacaagatccg 360 acccatcaag gcttctgtca tcactcaggt gttccatgta cccctggaggagagaaaata 420 caaggatatg aaccagtttg gagaaggtga acaagcaaaa atctgccttgagatcatgca 480 gagaactggt gctcacttgg agctgtcttt ggccaaagac caaggcctctccatcatggt 540 gtcaggaaag ctggatgctg tcatgaaagc tcggaaggac attgttgctagactgcagac 600 tcaggcctca gcaactgttg ccattcccaa agaacaccat cgctttgttattggcaaaaa 660 tggagagaaa ctgcaagact tggagctaaa aactgcaacc aaaatccagatcccacgccc 720 agatgacccc agcaatcaga tcaagatcac tggcaccaaa gagggcatcgagaaagctcg 780 ccatgaagtc ttactcatct ctgccgagca ggacaaacgt gctgtggagaggctagaagt 840 agaaaaggca ttccacccct tcatcgctgg gccgtataat agactggttggcgagatcat 900 gcaggagaca ggcacgcgca tcaacatccc cccacccagc gtgaaccggacagagattgt 960 cttcactgga gagaaggaac agttggctca ggctgtggct cgcatcaagaagatttatga 1020 ggagaagaaa aagaagacta caaccattgc agtggaagtg aagaaatcccaacacaagta 1080 tgtcattggg cccaagggca attcattgca ggagatcctt gagagaactggagtttccgt 1140 tgagatccca ccctcagaca gcatctctga gactgtaata cttcgaggcgaacctgaaaa 1200 gttaggtcag gcgttgactg aagtctatgc caaggccaat agcttcaccgtctcctctgt 1260 cgccgcccct tcctggcttc accgtttcat cattggcaag aaagggcagaacctggccaa 1320 aatcactcag cagatgccaa aggttcacat cgagttcaca gagggcgaagacaagatcac 1380 cctggagggc cctacagagg atgtcaatgt ggcccaggaa cagatagaaggcatggtcaa 1440 agatttgatt aaccggatgg actatgtgga gatcaacatc gaccacaagttccacaggca 1500 cctcattggg aagagcggtg ccaacataaa cagaatcaaa gaccagtacaaggtgtccgt 1560 gcgcatccct cctgacagtg agaagagcaa tttgatccgc atcgagggggacccacaggg 1620 cgtgcagcag gccaagcgag agctgctgga gcttgcatct cgcatggaaaatgagcgtac 1680 caaggatcta atcattgagc aaagatttca tcgcacaatc attgggcagaagggtgaacg 1740 gatccgtgaa attcgtgaca aattcccaga ggtcatcatt 
aactttccagacccagcaca 1800 aaaaagtgac attgtccagc tcagaggacc taagaatgag gtggaaaaatgcacaaaata 1860 catgcagaag atggtggcag atctggtgga aaatagctat tcaatttctgttccgatctt 1920 caaacagttt cacaagaata tcattgggaa aggaggcgca aacattaaaaagattcgtga 1980 agaaagcaac accaaaatcg accttccagc agagaatagc aattcagagaccattatcat 2040 cacaggcaag cgagccaact gcgaagctgc ccggagcagg attctgtctattcagaaaga 2100 cctggccaac atagccgagg tagaggtctc catccctgcc aagctgcacaactccctcat 2160 tggcaccaag ggccgtctga tccgctccat catggaggag tgcggcggggtccacattca 2220 ctttcccgtg gaaggttcag gaagcgacac cgttgttatc aggggcccttcctcggatgt 2280 ggagaaggcc aagaagcagc tcctgcatct ggcggaggag aagcaaaccaagagtttcac 2340 tgttgacatc cgcgccaagc cagaatacca caaattcctc atcggcaaggggggcggcaa 2400 aattcgcaag gtgcgcgaca gcactggagc acgtgtcatc ttccctgcggctgaggacaa 2460 ggaccaggac ctgatcacca tcattggaaa ggaggacgcc gtccgagaggcacagaagga 2520 gctggaggcc ttgatccaaa acctggataa tgtggtggaa gactccatgctggtggaccc 2580 caagcaccac cgccacttcg tcatccgcag aggccaggtc ttgcgggagattgctgaaga 2640 gtatggcggg gtgatggtca gcttcccacg ctctggcaca cagagcgacaaagtcaccct 2700 caagggcgcc aaggactgtg tggaggcagc caagaaacgc attcaggagatcattgagga 2760 cctggaagct caggtgacat tagaatgtgc tataccccag aaattccatcgatctgtcat 2820 gggccccaaa ggttccagaa tccagcagat tactcgggat ttcagtgttcaaattaaatt 2880 cccagacaga gaggagaacg cagttcacag tacagagcca gttgtccaggagaatgggga 2940 cgaagctggg gaggggagag aggctaaaga ttgtgacccc ggctctccaaggaggtgtga 3000 catcatcatc atctctggcc ggaaagaaaa gtgtgaggct gccaaggaagctctggaggc 3060 attggttcct gtcaccattg aagtagaggt gccctttgac cttcaccgttacgttattgg 3120 gcagaaagga agtgggatcc gcaagatgat ggatgagttt gaggtgaacatacatgtccc 3180 ggcacctgag ctgcagtctg acatcatcgc catcacgggc ctcgctgcaaatttggaccg 3240 ggccaaggct ggactgctgg agcgtgtgaa ggagctacag gccgagcaggaggaccgggc 3300 tttaaggagt tttaagctga gtgtcactgt agaccccaaa taccatcccaagattatcgg 3360 gagaaagggg gcagtaatta cccaaatccg gttggagcat gacgtgaacatccagtttcc 3420 tgataaggac gatgggaacc agccccagga ccaaattacc atcacagggtacgaaaagaa 3480 cacagaagct 
gccagggatg ctatactgag aattgtgggt gaacttgagcagatggtttc 3540 tgaggacgtc ccgctggacc accgcgttca cgcccgcatc attggtgcccgcggcaaagc 3600 cattcgcaaa atcatggacg aattcaaggt ggacattcgc ttcccacagagcggagcccc 3660 agaccccaac tgcgtcactg tgacggggct cccagagaat gtggaggaagccatcgacca 3720 catcctcaat ctggaggagg aatacctagc tgacgtggtg gacagtgaggcgctgcaggt 3780 atacatgaaa cccccagcac acgaagaggc caaggcacct tccagaggctttgtggtgcg 3840 ggacgcaccc tggaccgcca gcagcagtga gaaggctcct gacatgagcagctctgagga 3900 atttcccagc tttggggctc aggtggctcc caagaccctc ccttggggccccaaacgata 3960 atgatcaaaa agaacagaac cctctccagc ctgctgaccc gaacccaaccacacaatggt 4020 ttgtctcaat ctgacccagc ggctggaccc tccgtaaatt gttgagcgctcttccccttc 4080 ccgaggtccg cagggagcct agcgcctggc tgtgtgtgcg gccgctcctccaggcctggc 4140 cgtgcccgct caggacctgc tccactgttt aacaataaac caaggtcatgagcattcgag 4200 ctaagataac agactccagc tcctggtcca cccggcatgt cagtcagcactctggccttc 4260 atcacgagag ctccgcagcc gtggctagga ttccacttcc tgtgtcatgacctcaggaaa 4320 taaacgtcct tgactttata aaagccccga attc 4354 23 2669 DNAHomo sapiens 23 tgacgtgaga ggagacttcc ggccactgcg ttgtagtcgg cccggctgcaaagcgttttt 60 ctgcaggctg ttttcccagg ttccctcggc ctgtacctcg cgcactcctcttgctccagg 120 tccttcagtc tccgctcgtc tcaccgtagg ctgtgacgac atgagcaacaaagaaggatc 180 aggagggttc aggaaaagga agcatgacaa tttcccacat aaccaaagaagagaagggaa 240 ggatgttaat tcatcttcac ccgtgatgtt ggcctttaaa tcatttcagcaggaacttga 300 tgcaaggcat gacaaatatg agagacttgt gaaacttagt cgggatataactgttgaaag 360 taaaaggaca atttttctcc tccataggat tacaagtgct cctgatatggaagatatatt 420 gactgaatca gaaattaaat tggatggtgt cagacaaaag atattccaggtagcccaaga 480 gctatcaggg gaagatatgc atcagttcca tcgagccatt actacaggactacaggaata 540 tgtggaagct gtctcttttc aacacttcat caaaacacga tcattaattagtatggatga 600 aattaataaa caattgatat ttacgactga agacaatggg aaagaaaataaaactccctc 660 ctctgatgca caggataagc agtttggtac ttggagactg agagtcacacctgtcgatta 720 ccttctggga gtggctgact taactggaga attgatgcgg atgtgtattaacagtgtggg 780 gaatggggac attgataccc cctttgaagt gagccagttt ttacgtcaggtttatgatgg 
840 gttttcattc attggcaaca ctggacctta cgaggtttct aagaagctgtataccttgaa 900 acaaagtttg gccaaagtgg agaatgcttg ttatgccttg aaagtcagagggtcagaaat 960 tccaaaacat atgttggcag atgtgttttc agttaaaaca gaaatgatagatcaagaaga 1020 gggcatttct tagaatctaa cgttactcag ttactaattc ttttgagaactcctaagaga 1080 ccaatttgta agacttattt agtatttcat ttaactttat tgtggcttttacatagaaac 1140 atattcagtt gtacttgttt taaattgtat acaagctgta cataaaattagccaaatgaa 1200 tcatttctta tatcttattc atgaaagttt gcatacagat gtttgcatatatgccttttt 1260 gaatttttgc tggttaacct ttatcattta tctttgtaat gtgaacatgcttcagagtgt 1320 accttttgcc ataacctatt ttaatttact ttttctgaag tttggagtgataatttttag 1380 tggaagcaat ttgtaattta agttggtagt tatattatat atagaaaagattttttagtt 1440 aaacttctgg acatgagcgt cctgtttaaa tttcttgtta atatcgtgccaagcctcaaa 1500 aataggctta ttccatggaa caagaattaa aaatgaataa gctatcaatatataatttaa 1560 gtacaagttt aggctgggcg tggtggctca cgcctgtaat cccagcactttgggaggccg 1620 aggtaggcag atcacgaggt caagagatca agaccagcct gaccaacgtggcgaaacctc 1680 gtctctacta gaaatacaaa aattagctgg atatggtggt atgtgcctgtaatcccagct 1740 acttgggagg ctgaggcagg agactcgctt gaacctggga ggcagaggttgcagtgagcc 1800 gagattgcgc cactgcactc cagcctgggt aacagagcaa gactccatctcaaaaaaaga 1860 aagaaagaaa aaagaaagta caagtttata aagtattata gtgaaaaattcgcattctgg 1920 ctgattttaa gccatttaaa atttatataa aacaaccttc cataaaaatttgacaggtgc 1980 ccagatgttg ctttctccat ttattttttg ttttttttta atcacagtaggtctgataga 2040 gaattggagc taaattataa tatttttgtt ggtaaagttg agttatatacttgtacatac 2100 aatggaaatg cttttagtag tgattattta gcaatttttg tttttgttatattaggcatg 2160 tttggaggct ttcctattct agcatttaaa tttaaatttt attaaaattaaataatttaa 2220 atctagcatt taaatttaaa taatttaagt ctagcattta cttttaaataattataatga 2280 agttttgaaa tactaagtta atccagacct ttagttgtcc catggtgttaataaagttgc 2340 caaagaagat gtattatgaa caattcagca ataagacaat tgtcaacacagttgagaata 2400 acaatggtaa tcgttagtaa tatttagaat tggaatttgc ctactgaaatagttatagat 2460 gattacttgt gatgtgaaac tgaattgagc atgacaacca gacatttccagttggttttg 2520 taagttttga gaatctagat actgggtttt 
attttttgaa agattagctctgtttgtaag 2580 ggctgattcc ttgaaaatgt aattttccag aaaaacacct aaagaaaataaaacatggac 2640 atgcctagta aaaaaaaaaa aaaaaaaaa 2669 24 5392 DNA Homosapiens 24 ggaattaaga atagtcaggt ggtgagtgga acgtctcttg gggtgtcggaattcaaaacg 60 gacctggagg atgttgatct ccaagaacat gccctggcgg cggctgcagggcatttcctt 120 cgggatgtat tcggctgaag agctcaagaa attaagtgtt aaatccattacgaaccctcg 180 atacctggac agcctgggga acccatcggc aaacggcctg tacgatttagctttgggccc 240 tgcagattcc aaagaggtgt gctccacctg cgtgcaggac ttcagcaactgttctgggca 300 cctgggccac attgagctcc cactcacagt gtataaccct ctcctcttcgataagctgta 360 cctgctgctt cggggctctt gtttaaactg ccacatgctg acttgtccccgggccgtgat 420 tcacctctta ctctgccagc tgagggttct ggaagtcggg gccctacaagcagtctacga 480 gcttgagaga attctgagca ggtttctgga agaaaatgcc gatccctctgcctctgaaat 540 tcgggaggaa ttagaacaat acacaactga aattgtgcag aacaacctcctggggtccca 600 gggcgcacat gtaaagaacg tgtgtgagag caagagcaag ctcattgctctcttctggaa 660 ggcacatatg aatgctaagc gctgtcccca ctgcaagacc gggcgatccgttgtccgaaa 720 ggaacacaac agcaagttga ctatcacatt tccagccatg gtgcacaggacagctggcca 780 gaaggactct gagcccctgg gaattgagga agctcagata ggaaaacgaggatacttaac 840 acccaccagt gcccgcgaac acctttctgc cctgtggaag aatgaaggattctttctgaa 900 ctaccttttt tcgggaatgg atgatgatgg tatggaatcc agattcaatcccagtgtgtt 960 ctttctagat ttcttggtgg tgccgccctc aaggtctcgc ccagtcagtcgcctaggaga 1020 ccagatgttt actaatggcc agacggtgaa cttgcaggct gtcatgaaggatgtagttct 1080 gattcgaaaa cttctggcat tgatggccca agaacagaag ttgccagaggaagtggccac 1140 acccactaca gatgaggaaa aagactcttt gattgctatt gaccgatcctttttgagtac 1200 acttccaggc cagtccctca tagacaaact ttacaacatt tggattcgccttcagagcca 1260 cgtcaatatt gtgtttgata gcgagatgga caaactaatg agggacaagtacccaggcat 1320 taggcagatc ctggagaaga aagaaggcct gttccgaaaa cacatgatgggaaagcgagt 1380 ggactcgact gcgcgctcag tcatctgccc agacatgtac atcaacaccaacgaaattgg 1440 aattcccatg gtgtttgcca caaaactgac ctacccacag ccagttaccccatggaatgt 1500 tcaggaactt aggcaagcgg tcatcaacgg ccctaatgtg cacccaggagcctccatggt 1560 catcaatgag gacggcagcc 
gcacagccct gagcgctgtg gacatgacccagcgagaggc 1620 cgtggccaag cagcttctga ccccagccac gggggcacct aagccccaggggacaaaaat 1680 tgtgtgccgg catgtgaaga atggggacat tctgctactg aaccgacagcccacactgca 1740 cagaccctcc atccaggccc accgtgcccg catcctgcct gaagagaaagtgctgcggct 1800 ccactatgcc aactgcaagg cctataatgc cgactttgat ggagacgagatgaatgccca 1860 tttcccccag agtgagctgg gccgggccga ggcctacgtc ctggcctgcactgatcagca 1920 gtaccttgtt cccaaggatg gccaaccatt ggcgggactg atccaggatcacatggtttc 1980 aggggcaagc atgactactc ggggttgctt tttcacccgg gagcactatatggagctggt 2040 gtaccgagga ctcacggaca aagtggggcg cgtgaagctc ctttctccttccatcctgaa 2100 gccctttccg ctgtggacag gaaaacaggt tgtgtcaacg ctgctcataaatataatccc 2160 agaggaccac atcccactga acttatctgg aaaggcgaaa atcactgggaaagcctgggt 2220 gaaggaaact cctcgatccg ttcctggctt taaccctgac tcgatgtgcgagtcccaggt 2280 gatcatcagg gaaggggagc tgctctgcgg agtgctggac aaggcgcactatgggagctc 2340 cgcctacggc ctggtccact gctgctatga gatctatgga ggcgagaccagcggcaaggt 2400 tctaacctgc ctggcccgcc tcttcaccgc ctacctgcag ctctacagaggcttcacctt 2460 gggcgtggaa gacattttgg tgaagccaaa gcgagatgtc aagaggcaacgtatcattga 2520 agaatccacc cactgcgggc cccaggctgt cagggctgca ttaaacctgccagaagccgc 2580 atcatatgat gaggtccgag gaaaatggca ggatgcccat ctgggcaaggaccagaggga 2640 ttttaacatg attgatctga agttcaagga ggaagtgaac cattacagcaatgagattaa 2700 caaggcatgc atgccttttg gcctacacag acagttccca gagaacacgctgcagctgat 2760 ggtgcagtcg ggagccaaag gttcaactgt gaacacgatg cagatctcgtgcctgctggg 2820 ccagattgaa ctggaaggtc ggagcacccc gctgatggcg tctggcaagtcactgccctg 2880 ctttgagcct tatgagttca cccccagggc tggtggcttt gtcactggcaggttcctcac 2940 cggcatcaaa cctcctgagt tcttcttcca ctgcatggca ggacgagagggcctggtgga 3000 cactgctgtg aaaaccagcc gctcaggcta tctccaaagg tgcatcatcaagcacctaga 3060 ggggctggtc gtgcagtatg atctcacggt ccgtgacagt gacggcagtgtggtgcagtt 3120 cctgtatggg gaggatggcc tggacatccc caagacacag ttcctgcagcccaagcagtt 3180 ccccttcctg gccagcaact acgaggtgat aatgaaatca cagcatctccatgaagtttt 3240 atccagagca gatcccaaaa aagctctcca ccacttcaga 
gctatcaaaaaatggcaaag 3300 caagcacccc aacaccctgc tgagaagagg cgccttcttg agttattcccagaaaattca 3360 ggaagctgtg aaagccctga aacttgagag tgaaaaccgc aatggccgcagaccctggga 3420 ctcagggagg atgctgagga tgtggtatga gttggatgag gaaagccgaaggaaatacca 3480 gaagaaggcg gccgcttgtc ctgaccccag tctgtctgtc tggcgtcctgacatctactt 3540 tgcatcagtg tcagaaacat ttgaaacaaa ggttgatgac tacagtcaagagtgggcagc 3600 tcaaacagag aagagttatg agaaatcaga gctttctctc gacaggttgaggaccttgct 3660 gcagctgaag tggcagcgct cactgtgtga gccgggcgag gctgtgggcctgctggctgc 3720 ccagagcatc ggagagccct ccacccagat gaccctcaac accttccactttgcaggcag 3780 aggcgagatg aacgtcaccc tgggcattcc aaggttgcgg gagattctcatggtggccag 3840 cgccaacatc aagacaccca tgatgagcgt gcccgtgctc aacaccaagaaagccctgaa 3900 gagagtgaaa agcctgaaga agcaactcac cagggtgtgc ttgggggaggtgttgcagaa 3960 aattgacgtc caggagtcct tctgtatgga agaaaaacag aacaaattccaggtgtacca 4020 gctgcggttt cagttcctgc cacatgcata ttaccagcag gagaagtgcctgagacccga 4080 ggacatcctg cgcttcatgg aaacaagatt ctttaaactt ctgatggaatccatcaaaaa 4140 gaagaataat aaagcatcag ctttcaggaa cgtaaacact cgaagagctacacagcggga 4200 tctggacaac gctggggagt tggggaggag tcggggagag caggagggtgatgaggaaga 4260 ggaggggcac attgtggatg ctgaagctga ggagggagac gccgatgcctctgatgccaa 4320 acgcaaggag aagcaggagg aggaggttga ttatgagagt gaggaagaggaggagaggga 4380 gggcgaggag aacgacgatg aagacatgca ggaggaacga aatccccacagggaaggtgc 4440 tcgaaagacc caagagcaag atgaagaggt gggcttagga ggacccgtcccttcccaccc 4500 tcctgacgca gccccggaaa cccacccaca gccaggagcc ccaggggccgaggccatgga 4560 gcgccgggtc caggctgtgc gtgagatcca cccgttcata gatgactaccagtacgacac 4620 cgaggagagc ctgtggtgcc aggtgacagt gaagctccct ctgatgaagatcaactttga 4680 catgagctcc ctggtagtat ctttggccca tggtgccgtc atctatgcgaccaagggcat 4740 cactcggtgc ctcctgaatg aaacaaccaa caataagaac gagaaggagcttgtgctaaa 4800 cacagaagga atcaacctcc cagagctatt caagtatgca gaggtcctggatctgcgccg 4860 cctctactcc aacgacatcc acgccatagc caacacgtat ggcattgaggcgctgcgggt 4920 gatcgagaag gagatcaagg atgtgtttgc cgtgtatggc atcgcggtcgaccctcgcca 4980 tctctccctg 
gttgctgatt atatgtgctt cgagggtgtt tacaagccactgaatcgctt 5040 tgggatccgg tcaaactctt ccccgctaca gcagatgaca tttgaaaccagcttccagtt 5100 tctgaagcaa gccaccatgc tgggatccca cgatgagctg aggtctccttctgcctgcct 5160 tgtggtcggg aaagtcgtca ggggcgggac aggcctgttc gagctcaagcagcctctgag 5220 atagcagcta ccccggcacc atctgcccag ctccaaggac ccttggtgagggtggttggc 5280 cagccctgcc ttctgcatga gaggaccagg agactggaat ccagggcagttccaagtgac 5340 agtacagagc acagcagcga ccttgggcct gaaagcagtg ggcctctgagct 5392 25 1353 DNA Homo sapiens 25 gatctcaaga tggcgctgca ctcaatgcggaaagcgcgtg agcgctggag cttcatccgg 60 gcacttcata agggatccgc agctgctcccgctctccaga aagacagcaa gaagcgagta 120 ttttccggca ttcaacctac aggaatcctccacctgggca attacctggg agccattgag 180 agctgggtga ggttacagga tgaatatgactctgtattat acagcattgt tgacctccac 240 tccattactg tcccccaaga cccagctgtccttcggcaga gcatcctgga catgactgct 300 gttcttcttg cctgtggcat aaacccggaaaaaagcatcc ttttccaaca atctcaggtg 360 tctgaacaca cacaattaag ttggatcctttcctgcatgg tcagactacc tcgattacaa 420 catttacatc agtggaaggc aaagactaccaagcagaagc acgatggcac ggtgggcctg 480 ctcacatacc cagtactcca ggcagccgacattctgttgt acaagtccac acacgttcct 540 gttggggagg atcaagtcca gcacatggaactagttcagg atctagcaca aggtttcaac 600 aagaagtatg gggagttctt tccagtgcccgagtccattc tcacatccat gaagaaggta 660 aaatccctac gtgatccttc tgccaaaatgtcgaaatcag accctgacaa actggccacc 720 gtccgaataa cagacagccc agaggagatagtgcagaaat tccgcaaggc tgtgacagac 780 ttcacctcgg aggtcaccta tgacccggctggccgcgctg gcgtgtccaa catagtggcg 840 gtgcatgccg cggtgacggg gctctccgtggaggaagtgg tgcgccgcag cgcgggcatg 900 aacactgctc gctacaagct ggccgtggcagatgctgtga ttgagaagtt tgccccaatt 960 aagcgtgaaa ttgaaaaact gaagctggacaaggaccatt tagagaaggt tttacaaatt 1020 ggatcagcaa aagccaaaga attagcatacactgtgtgcc aggaggtgaa gaaattggtg 1080 ggttttctat aggaagtttc aacgaatcacagcaaggctt ttgtgccttg cactccatgc 1140 attctgataa cggcagcttt cctaaaaagaaaaagttata gttttgggac atttaatttg 1200 gtatagctga ttattggctt tatttgatgaatattgcttt gtagctttga aatacgacag 1260 tgttccaaat cccatcaaca aaatgctgtgaacaacaaca 
acaaaaaata aatcaagaag 1320 gcatagcaaa aaaaaaaaaa aaaaaaaaaaaaa 1353 26 2889 DNA Homo sapiens 26 atggatgaac aggctctatt agggctaaatccaaatgctg attcagactt tagacaaagg 60 gccctggcct attttgagca gttaaaaatttccccagatg cctggcaggt gtgtgcagaa 120 gctctagccc agaggacata cagtgatgatcatgtgaagt ttttctgctt tcaagtactg 180 gaacatcaag ttaaatacaa atactcagaactaaccactg ttcaacaaca gctaattagg 240 gagacgctca tatcatggct gcaagctcagatgctgaatc cccaaccaga gaagaccttt 300 atacgaaata aagccgccca agtcttcgccttgctttttg ttacagagta tctcactaag 360 tggcccaagt ttttttttga cattctctcagtagtggacc taaatccaag gggagtagat 420 ctctacctgc gaatcctcat ggctattgattcagagttgg tggatcgtga tgtggtgcat 480 acatcagagg aggctcgtag gaatactctcataaaagata ccatgaggga acagtgcatt 540 ccaaatctgg tggaatcatg gtaccaaatattacaaaatt atcagtttac taattctgaa 600 gtgacgtgtc agtgccttga agtagttggggcttatgtct cttggataga cttatccctt 660 atagccaatg ataggtttat aaatatgctgctaggtcata tgtcaataga agttctacgg 720 gaagaagcat gtgactgttt atttgaagttgtaaataaag gaatggaccc tgttgataaa 780 atgaaactag tggaatcttt gtgtcaagtattacagtctg ctgggttttt cagcattgac 840 caggaagaag atgttgactt cctggccagattttctaagt tggtaaatgg aatgggacag 900 tcattgatag ttagttggag taaattaattaagaatgggg atattaagaa tgctcaagag 960 gcactacaag ctattgaaac aaaagtggcactgatgttgc agctactaat tcatgaggat 1020 gatgatattt cttctaatat tattggattttgttacgatt atcttcatat tttgaaacgg 1080 cttacagtgc tctcggatca gcaaaaagctaatgtagagg caatcatgtt ggccgttatg 1140 aaaaaattga cttacgatga agaatataactttgaaaatg agggtgaaga tgaagccatg 1200 tttgtagaat atagaaaaca actgaagttactgttggaca ggcttgctca agtttcacca 1260 gagttactac tggcctctgt tcgcagagtttttagttcta cactgcagaa ttggcagact 1320 acacggttta tggaagttga agtagcaataagattgctgt atatgttggc agaagctctt 1380 ccagtatctc atggtgctca cttctcaggtgatgtttcaa aagctagtgc tttgcaggat 1440 atgatgcgaa ctctggtaac atcaggagtcagttcctatc agcatacatc tgtgacattg 1500 gagttcttcg aaactgttgt tagatatgaaaagtttttca cagttgaacc tcagcacatt 1560 ccatgtgtac taatggcttt cttagatcacagaggtctgc ggcattccag tgcaaaagtt 1620 cggagcagga cggcttacct 
gttttctagatttgtcaaat ctctcaataa gcaaatgaat 1680 cctttcattg aggatatttt gaatagaatacaagatttat tagagctttc tccacctgag 1740 aatggccacc agtccttact gagcagcgatgatcaacttt ttatttatga gacagctgga 1800 gtgctgattg ttaatagtga atatccggcagaaaggaaac aagccttaat gaggaatctg 1860 ttgactccac taatggagaa gtttaaaattctgttagaaa agttgatgct ggcacaagat 1920 gaagaaaggc aagcctccct agcagactgtcttaaccatg ctgttggatt tgcaagtcga 1980 accagtaaag ctttcagcaa caaacagactgtgaaacaat gtggctgttc cgaagtttat 2040 ctggactgtt tacagacatt cttgccagccctcagttgtc ccttacaaaa ggatattctc 2100 agaagtggag tccgtacttt ccttcatcgaatgattattt gcctggagga agaagttctt 2160 ccgttcattc catctgcttc agaacatatgctcaaagatt gtgaagcaaa agatctccag 2220 gagttcattc ctcttatcaa ccagattacggccaaattca agatacaggt atccccgttt 2280 ttacaacaga tgttcatgcc cctgcttcatgcaatttttg aagtgctgct ccggccagca 2340 gaagaaaatg accagtctgc tgctttagagaagcagatgt tgcggaggag ttactttgct 2400 ttcctgcaaa cagtcacagg cagtgggatgagcgaagtta tagcaaatca aggtgcagag 2460 aatgtagaaa gagtgttggt tactgttatccaaggagcag ttgaatatcc agatccaatt 2520 gcacagaaaa catgttttat catcctctcaaagttggtag aactctgggg aggtaaagat 2580 ggaccagtgg gatttgctga ttttgtttataagcacattg tccccgcatg tttcctagca 2640 cctttaaaac aaacctttga cctggcagatgcacaaacag tattggcttt atctgagtgt 2700 gcagtgacac tgaaaacaat tcatctcaaacggggcccag aatgtgttca gtatcttcaa 2760 caagaatacc tgccctcctt gcaagtagctccagaaataa ttcaggagtt ttgtcaagcg 2820 cttcagcagc ctgatgctaa agtttttaaaaattacttaa aggtgttctt ccagagagca 2880 aagccctga 2889 27 2748 DNA Homosapiens 27 cgccacccct gattgcggtg ccacggactg ctcctgctgg gcggagaggacagattttgc 60 aaagcggagg ctgcgacggg tcctgcaggg ggacagtgag gaaagggccgcctcgtctcc 120 gctcctgggg gaccgcagaa ataagaatca aactccacaa tgacaacctatttggaattc 180 attcaacaaa atgaagaacg agatggagtc cgatttagtt ggaatgtttggccatcaagt 240 cgactggaag ctacaagaat ggttgttcct gtggcagccc tgtttacaccactgaaagag 300 agacctgact taccacctat tcaatatgaa cctgttctgt gtagtaggaccacttgccgt 360 gcagttttga atcctttatg tcaagtggat tatcgagcaa aactttgggcttgcaacttt 420 tgttaccaaa ggaatcagtt 
tccacctagt tatgctggta tatctgaactgaatcagcct 480 gctgaacttt tacctcagtt ttctagcatt gaatatgtag ttctgcgtggtcctcagatg 540 cctttgatat tcctctatgt ggttgatact tgcatggaag atgaagatttacaagccctg 600 aaagaatcca tgcagatgtc attaagtctt ttaccaccta cagctttggttggacttatt 660 acttttggga gaatggttca ggttcatgaa cttggatgtg aaggcatttcaaaaagctat 720 gtcttcagag gaacaaaaga tttgtctgcc aaacaactgc aggaaatgctggggctctct 780 aaagtaccag ttactcaagc aacacgtggt cctcaggtac agcagccacctccttccaac 840 agattcttac aaccagtaca gaaaatagac atgaatctca cagatcttctgggagaactc 900 cagcgagacc cttggcctgt accacaggga aagagacctt tgcgttcctctggggtggca 960 ctttccatag ctgtaggact gctggagtgt acttttccca acactggtgctcgtatcatg 1020 atgttcattg gtggtcctgc tactcagggg cctggaatgg tggttggagatgagttgaag 1080 acacctataa gatcgtggca tgacattgac aaagacaatg ccaaatatgttaaaaaggga 1140 actaagcatt ttgaagcatt ggctaatcga gctgctacaa ctggccatgttattgatatc 1200 tatgcgtgtg cattagatca gacaggtctc ctggagatga aatgctgtcccaaccttact 1260 ggaggataca tggtaatggg tgattctttc aatacttcct tattcaaacaaacttttcaa 1320 agagtcttta ccaaagacat gcatggacag tttaaaatgg gctttggtggtacgctagaa 1380 ataaagacct caagggaaat aaagatttca ggagctattg gaccctgtgtgtcactcaat 1440 tctaaaggac cctgtgtgtc tgaaaatgag ataggaacag gtggcacatgtcagtggaag 1500 atatgtggac ttagtcccac tacaacctta gccatatatt ttgaggttgtcaatcagcat 1560 aatgctccaa ttcctcaagg agggcgtggt gcaatccagt ttgtgactcagtatcagcat 1620 tcaagtgggc agagacgcat ccgagtgacc accattgcta ggaactgggcagatgctcaa 1680 actcaaatcc aaaacattgc tgcatctttt gaccaggagg cagctgccattcttatggcc 1740 cggctagcaa tatatagagc agaaacagaa gaaggtccag atgtgcttaggtggctggac 1800 agacagctca ttcgactgtg tcagaaattt ggagaatatc ataaagatgacccaagttcc 1860 ttcagatttt cagaaacttt ctccctttat ccacagttta tgtttcatttaagaagatct 1920 tctttcctgc aagtttttaa caatagtcct gatgagagtt catattatcgtcaccatttt 1980 atgcgtcaag atctgaccca gtctctaatt atgattcagc ctatcctgtatgcgtattct 2040 tttagtggac caccagagcc ggttcttctt gatagcagta gcattcttgcagatcgtatt 2100 cttctcatgg acacattctt ccagattttg atttatcatg gtgagaccatagcacagtgg 2160 
cggaagtcag gataccagga tatgcctgag tatgaaaatt tccgccaccttctgcaagcc 2220 ccagtggatg atgcacagga aattcttcac tccagatttc caatgccaagatacattgac 2280 actgaacatg gaggcagcca ggcccgtttc ctcctttcaa aagtcaacccttcacagact 2340 cataataata tgtatgcctg ggggcaggag tctggagcac ctattcttacagatgatgtt 2400 agtttacaag tgtttatgga tcacttgaag aaacttgctg tgtccagtgctgcttgaagt 2460 gctaataatg ttaaagacac cttaagaaga tgaaataata ttccaaatttcattttttcc 2520 tttttccatt tatctgtgga aaccaacaga tattgctcta tattttttgtattagtatgg 2580 tttgagacaa catatggaaa atgttcacat ttgtagatta agctggaattataatgagag 2640 caataagaac aaatttattt tgcttaccac agtgttatag ctggttctagaaatttgaag 2700 tctttataac ttaattatgt tttaataaaa aatagagtct gcctcgta2748 28 6417 DNA Homo sapiens 28 gcggctccgg gtgactcggg ccagtgtagaggtcctcagg ccgccggcag gagcagctgg 60 gccaattccc tggccgggag cggaaggggatggcgtcggg cctgggctcc ccgtccccct 120 gctcggcggg cagtgaggag gaggatatggatgcactttt gaacaacagc ctgcccccac 180 cccacccaga aaatgaagag gacccagaagaggatttgtc agaaacagag actccaaagc 240 tcaagaagaa gaaaaagcct aagaaacctcgggaccctaa aatccctaag agcaagcgcc 300 aaaaaaagga gcgtatgctc ttatgccggcagctggggga cagctctggg gaggggccag 360 agtttgtgga ggaggaggaa gaggtggctctgcgctcaga cagtgagggc agcgactata 420 ctcctggcaa gaagaagaag aagaagcttggacctaagaa agagaagaag agcaaatcca 480 agcggaagga ggaggaggag gaggatgatgatgatgatga ttcaaaggag cctaaatcat 540 ctgctcagct cctggaagac tggggcatggaagacattga ccacgtgttc tcagaggagg 600 attatcgaac cctcaccaac tacaaggccttcagccagtt tgtcagaccc ctcattgctg 660 ccaaaaatcc caagattgct gtctccaagatgatgatggt tttgggtgca aaatggcggg 720 agttcagtac caataacccc ttcaaaggcagttctggggc atcagtggca gctgcggcag 780 cagcagcggt agctgtggtg gagagcatggtgacagccac tgaggttgca ccaccacctc 840 cccctgtgga ggtgcctatc cgcaaggccaagaccaagga gggcaaaggt cccaatgctc 900 ggaggaagcc caagggcagc cctcgtgtacctgatgccaa gaagcctaaa cccaagaaag 960 tagctcccct gaaaatcaag ctgggaggttttggttccaa gcgtaagaga tcctcgagtg 1020 aggatgatga cttagatgtg gaatctgacttcgatgatgc cagtatcaat agctattctg 1080 tttctgatgg ttccaccagc 
cgtagtagccgcagccgcaa gaaactccga accactaaaa 1140 agaaaaagaa aggcgaggag gaggtgactgctgtggatgg ttatgagaca gaccaccagg 1200 actattgcga ggtgtgccag caaggcggtgagatcatcct gtgtgatacc tgtccccgtg 1260 cttaccacat ggtctgcctg gatcccgacatggagaaggc tcccgagggc aagtggagct 1320 gcccacactg cgagaaggaa ggcatccagtgggaagctaa agaggacaat tcggagggtg 1380 aggagatcct ggaagaggtt gggggagacctcgaagagga ggatgaccac catatggaat 1440 tctgtcgggt ctgcaaggat ggtggggaactgctctgctg tgatacctgt ccttcttcct 1500 accacatcca ctgcctgaat cccccacttccagagatccc caacggtgaa tggctctgtc 1560 cccgttgtac gtgtccagct ctgaagggcaaagtgcagaa gatcctaatc tggaagtggg 1620 gtcagccacc atctcccaca ccagtgcctcggcctccaga tgctgatccc aacacgccct 1680 ccccaaagcc cttggagggg cggccagagcggcagttctt tgtgaaatgg caaggcatgt 1740 cttactggca ctgctcctgg gtttctgaactgcagctgga gctgcactgt caggtgatgt 1800 tccgaaacta tcagcggaag aatgatatggatgagccacc ttctggggac tttggtggtg 1860 atgaagagaa aagccgaaag cgaaagaacaaggaccctaa atttgcagag atggaggaac 1920 gcttctatcg ctatgggata aaacccgagtggatgatgat ccaccgaatc ctcaaccaca 1980 gtgtggacaa gaagggccac gtccactacttgatcaagtg gcgggactta ccttacgatc 2040 aggcttcttg ggagagtgag gatgtggagatccaggatta cgacctgttc aagcagagct 2100 attggaatca cagggagtta atgaggggtgaggaaggccg accaggcaag aagctcaaga 2160 aggtgaagct tcggaagttg gagaggcctccagaaacgcc aacagttgat ccaacagtga 2220 agtatgagcg acagccagag tacctggatgctacaggtgg aaccctgcac ccctatcaaa 2280 tggagggcct gaattggttg cgcttctcctgggctcaggg cactgacacc atcttggctg 2340 atgagatggg ccttgggaaa actgtacagacagcagtctt cctgtattcc ctttacaagg 2400 agggtcattc caaaggcccc ttcctagtgagcgcccctct ttctaccatc atcaactggg 2460 agcgggagtt tgaaatgtgg gctccagacatgtatgtcgt aacctatgtg ggtgacaagg 2520 acagccgtgc catcatccga gagaatgagttctcctttga agacaatgcc attcgtggtg 2580 gcaagaaggc ctcccgcatg aagaaagaggcatctgtgaa attccatgtg ctgctgacat 2640 cctatgaatt gatcaccatt gacatggctattttgggctc tattgattgg gcctgcctca 2700 tcgtggatga agcccatcgg ctgaagaacaatcagtctaa gttcttccgg gtattgaatg 2760 gttactcact ccagcacaag ctgttgctgactgggacacc attacaaaac 
aatctggaag 2820 agttgtttca tctgctcaac tttctcacccccgagaggtt ccacaatttg gaaggttttt 2880 tggaggagtt tgctgacatt gccaaggaggaccagataaa aaaactgcat gacatgctgg 2940 ggccgcacat gttgcggcgg ctcaaagccgatgtgttcaa gaacatgccc tccaagacag 3000 aactaattgt gcgtgtggag ctgagccctatgcagaagaa atactacaag tacatcctca 3060 ctcgaaattt tgaagcactc aatgcccgaggtggtggcaa ccaggtgtct ctgctgaatg 3120 tggtgatgga tcttaagaag tgctgcaaccatccatacct cttccctgtg gctgcaatgg 3180 aagctcctaa gatgcctaat ggcatgtatgatggcagtgc cctaatcaga gcatctggga 3240 aattattgct gctgcagaaa atgctcaagaaccttaagga gggtgggcat cgtgtactca 3300 tcttttccca gatgaccaag atgctagacctgctagagga tttcttggaa catgaaggtt 3360 ataaatacga acgcatcgat ggtggaatcactgggaacat gcggcaagag gccattgacc 3420 gcttcaatgc accgggtgct cagcagttctgcttcttgct ttccactcga gctgggggcc 3480 ttggaatcaa tctggccact gctgacacagttattatcta tgactctgac tggaaccccc 3540 ataatgacat tcaggccttt agcagagctcaccggattgg gcaaaataaa aaggtaatga 3600 tctaccggtt tgtgacccgt gcgtcagtggaggagcgcat cacgcaggtg gcaaagaaga 3660 aaatgatgct gacgcatcta gtggtgcggcctgggctggg ctccaagact ggatctatgt 3720 ccaaacagga gcttgatgat atcctcaaatttggcactga ggaactattc aaggatgaag 3780 ccactgatgg aggaggagac aacaaagagggagaagatag cagtgttatc cactacgatg 3840 ataaggccat tgaacggctg ctagaccgtaaccaggatga gactgaagac acagaattgc 3900 agggcatgaa tgaatatttg agctcattcaaagtggccca gtatgtggta cgggaagaag 3960 aaatggggga ggaagaggag gtagaacgggaaatcattaa acaggaagaa agtgtggatc 4020 ctgactactg ggagaaattg ctgcggcaccattatgagca gcagcaagaa gatctagccc 4080 gaaatctggg caaaggaaaa agaatccgtaaacaggtcaa ctacaatgat ggctcccagg 4140 aggaccgaga ttggcaggac gaccagtccgacaaccagtc cgattactca gtggcttcag 4200 aggaaggtga tgaagacttt gatgaacgttcagaagctcc ccgtaggccc agtcgtaagg 4260 gcctgcggaa tgataaagat aagccattgcctcctctgtt ggcccgtgtt ggtgggaata 4320 ttgaagtact tggttttaat gctcgtcagcgaaaagcctt tcttaatgca attatgcgat 4380 atggtatgcc acctcaggat gcttttactacccagtggct tgtaagagac ctgcgaggca 4440 aatcagagaa agagttcaag gcatatgtctctcttttcat gcggcattta tgtgagccgg 4500 gggcagatgg ggctgagacc 
tttgctgatggtgtcccccg agaaggcctg tctcgccagc 4560 atgtccttac tagaattggt gttatgtctttgattcgcaa gaaggttcag gagtttgaac 4620 atgttaatgg gcgctggagc atgcctgaactggctgaggt ggaggaaaac aagaagatgt 4680 cccagccagg gtcaccctcc ccaaaaactcctacaccctc cactccaggg gacacgcagc 4740 ccaacactcc tgcacctgtc ccacctgctgaagatgggat aaaaatagag gaaaatagcc 4800 tcaaagaaga agagagcata gaaggagaaaaggaggttaa atctacagcc cctgagactg 4860 ccattgagtg tacacaggcc cctgcccctgcctcagagga tgaaaaggtc gttgttgaac 4920 cccctgaggg agaggagaaa gtggaaaaggcagaggtgaa ggagagaaca gaggaaccta 4980 tggagacaga gcccaaaggt gctgctgatgtagagaaggt ggaggaaaag tcagcaatag 5040 atctgacccc tattgtggta gaagacaaagaagagaagaa agaagaagaa gagaaaaaag 5100 aggtgatgct tcagaatgga gagacccccaaggacctgaa tgatgagaaa cagaagaaaa 5160 atattaaaca acgtttcatg tttaacattgcagatggtgg ttttactgag ttgcactccc 5220 tttggcagaa tgaagagcgg gcagccacagttaccaagaa gacttatgag atctggcatc 5280 gacggcatga ctactggctg ctagccggcattataaacca tggctatgcc cggtggcaag 5340 acatccagaa tgacccacgc tatgccatcctcaatgagcc tttcaagggt gaaatgaacc 5400 gtggcaattt cttagagatc aagaataaatttctagctcg aaggtttaag ctcttagaac 5460 aagctctggt gattgaggaa cagctgcgccgggctgctta cttgaacatg tcagaagacc 5520 cttctcaccc ttccatggcc ctcaacacccgctttgctga ggtggagtgt ttggcggaaa 5580 gtcatcagca cctgtccaag gagtcaatggcaggaaacaa gccagccaat gcagtcctgc 5640 acaaagttct gaaacagctg gaagaactgctgagtgacat gaaagctgat gtgactcgac 5700 tcccagctac cattgcccga attcccccagttgctgtgag gttacagatg tcagagcgta 5760 acattctcag ccgcctggca aaccgggcacccgaacctac cccacagcag gtagcccagc 5820 agcagtgaag atgcagactg ataccacctccaccgctgag cagtgacctt cctcactttc 5880 tcttgtccca gcttctcccc tgggggcctgagagaccctc accttccttc tgcccatctt 5940 ccatgttgta aaggaacagc cccagtgcactgggggaggg gagggagtga ggggcagtgg 6000 tgcccttcct gcagaagaga catgcagcagtagcgctggc gccatctgca ggagctggcg 6060 ggctggcctt ctggaccctg gcttctccccactgtaacgc ctgttacaca caaactgttg 6120 tgggttcctg ccaggcttga agaaaatgatctgaattttt tcctcctttt ggttttattt 6180 tgttggttta ttttgtgttt tcttttctcctttttggggg gtattcagag 
tgggctgggc 6240 ccctgggcga gacacagcta cctctgttggcatcttttta ataccaggaa cccagcggct 6300 ctagccactg agcggctaaa tgaaataaagtggaaaaaaa aaaaaaagga aaaaaccaaa 6360 agcataaaaa accacagcaa atttcttgatgaaaattgaa aataaaagtt tccttgt 6417 29 1560 DNA Homo sapiens 29ccctgagtca ctgcctgcgc acgtccggcc gcctggctcc ccatactagt cgccgatatt 60tggagttctt acaacatggc agacattgac aacaaagaac agtctgaact tgatcaagat 120ttggatgatg ttgaagaagt agaagaagag gaaactggtg aagaaacaaa actcaaagca 180cgtcagctaa ctgttcagat gatgcaaaat cctcagattc ttgcagccct tcaagaaaga 240cttgatggtc tggtagaaac accaacagga tacattgaaa gcctgcctag ggtagttaaa 300agacgagtga atgctctcaa aaacctgcaa gttaaatgtg cacagataga agccaaattc 360tatgaggaag ttcatgatct tgaaaggaag tatgctgttc tctatcagcc tctatttgat 420aagcgatttg aaattattaa tgcaatttat gaacctacgg aagaagaatg tgaatggaaa 480ccagatgaag aagatgagat ttcggaggaa ttgaaagaaa aggccaagat tgaagatgag 540aaaaaggatg aagaaaaaga agaccccaaa ggaattcctg aattttggtt aactgttttt 600aagaatgttg acttgctcag tgatatggtt caggaacacg atgaacctat tctgaagcac 660ttgaaagata ttaaagtgaa gttctcagat gctggccagc ctatgagttt tgtcttagaa 720tttcactttg aacccaatga atattttaca aatgaagtgc tgacaaagac atacaggatg 780aggtcagaac cagatgattc tgatcccttt tcttttgatg gaccagaaat tatgggttgt 840acagggtgcc agatagattg gaaaaaagga aagaatgtca ctttgaaaac tattaagaag 900aagcagaaac acaagggacg tgggacagtt cgtactgtga ctaaaacagt ttccaatgac 960tctttcttta acttttttgc ccctcctgaa gttcctgaga gtggagatct ggatgatgat 1020gctgaagcta tccttgctgc agacttcgaa attggtcact ttttacgtga gcgtataatc 1080ccaagatcag tgttatattt tactggagaa gctattgaag atgatgatga tgattatgat 1140gaagaaggtg aagaagcgga tgaggaaggg gaagaagaag gagatgagga aaatgatcca 1200gactatgacc caaagaagga tcaaaaccca gcagagtgca agcagcagtg aagcaggatg 1260tatgtggcct tgaggataac ctgcactggt ctaccttctg cttccctgga aaggatgaat 1320ttacatcatt tgacaagcct attttcaagt tatttgttgt ttgtttgctt gtttttgttt 1380ttgcagctaa aataaaaatt tcaaatacaa ttttagttct tacaagataa tgtcttaatt 1440ttgtaccaat tcaggtagaa gtagaggcct accttgaatt aagggttata ctcagttttt 1500aacacattgt 
tgaagaaaag gtaccagctt tggaacgaga tgctatacta ataagcaagt 156030 1049 DNA Homo sapiens 30 gcccggtgcc aagcgcagct agctcagcag gcggcagcggcggcctgagc ttcagggcag 60 ccagctcctc ccggtctcgc cttcctcgcg gtcagcatgaaagccttcag tcccgtgagg 120 tccgttagga aaaacagcct gtcggaccac agcctgggcatctcccggag caaaacccct 180 gtggacgacc cgatgagcct gctatacaac atgaacgactgctactccaa gctcaaggag 240 ctggtgccca gcatccccca gaacaagaag gtgagcaagatggaaatcct gcagcacgtc 300 atcgactaca tcttggacct gcagatcgcc ctggactcgcatcccactat tgtcagcctg 360 catcaccaga gacccgggca gaaccaggcg tccaggacgccgctgaccac cctcaacacg 420 gatatcagca tcctgtcctt gcaggcttct gaattcccttctgagttaat gtcaaatgac 480 agcaaagcac tgtgtggctg aataagcggt gttcatgatttcttttattc tttgcacaac 540 aacaacaaca acaaattcac ggaatctttt aagtgctgaacttatttttc aaccatttca 600 caaggaggac aagttgaatg gaccttttta aaaagaaaaaaaaaatgaag gaaaactaag 660 aatgatcatc ttcccagggt tcttacttga ctgtaattcgttatttatga aaaaaccttt 720 taaatgccct ttctgcagtt ggaaggtttt ctttatatactattcccacc atggggagcg 780 aaaacgttaa aatcacaagg aattgcccaa tctaagcagactttgccttt tttcaaaggt 840 ggagcgtgat accagaagga tccagtattc agtcacttaaatgaagtctt ttggtcagaa 900 attacctttt tcacacaagc ctactgaatg ctgtgtatatatttatatat aaatatatct 960 atttgagtga aaccttgtga acctttaatt agagtcttcttgtatagtgg cagagatgtc 1020 tattctgcat caaagtgtaa tgatgtact 1049 31 6802DNA Homo sapiens 31 cggccccaga aaacccgagc gagtaggggg cggcgcgcaggagggaggag aactgggggc 60 gcgggaggct ggtgggtgtc gggggtggag atgtagaagatgtgacgccg cggcccggcg 120 ggtgccagat tagcggacgg ctgcccgcgg ttgcaacgggatcccgggcg ctgcagcttg 180 ggaggcggct ctccccaggc ggcgtccgcg gagacacccatccgtgaacc ccaggtcccg 240 ggccgccggc tcgccgcgca ccaggggccg gcggacagaagagcggccga gcggctcgag 300 gctgggggac cgcgggcgcg gccgcgcgct gccgggcgggaggctggggg gccggggccg 360 gggccgtgcc ccggagcggg tcggaggccg gggccggggccgggggacgg cggctccccg 420 cgcggctcca gcggctcggg gatcccggcc gggccccgcagggaccatgg cagccgggag 480 catcaccacg ctgcccgcct tgcccgagga tggcggcagcggcgccttcc cgcccggcca 540 cttcaaggac cccaagcggc tgtactgcaa aaacgggggcttcttcctgc 
gcatccaccc 600 cgacggccga gttgacgggg tccgggagaa gagcgaccctcacatcaagc tacaacttca 660 agcagaagag agaggagttg tgtctatcaa aggagtgtgtgctaaccgtt acctggctat 720 gaaggaagat ggaagattac tggcttctaa atgtgttacggatgagtgtt tcttttttga 780 acgattggaa tctaataact acaatactta ccggtcaaggaaatacacca gttggtatgt 840 ggcactgaaa cgaactgggc agtataaact tggatccaaaacaggacctg ggcagaaagc 900 tatacttttt cttccaatgt ctgctaagag ctgattttaatggccacatc taatctcatt 960 tcacatgaaa gaagaagtat attttagaaa tttgttaatgagagtaaaag aaaataaatg 1020 tgtatagctc agtttggata attggtcaaa caattttttatccagtagta aaatatgtaa 1080 ccattgtccc agtaaagaaa aataacaaaa gttgtaaaatgtatattctc ccttttatat 1140 tgcatctgct gttacccagt gaagcttacc tagagcaatgatctttttca cgcatttgct 1200 ttattcgaaa agaggctttt aaaatgtgca tgtttagaaacaaaatttct tcatggaaat 1260 catatacatt agaaaatcac agtcagatgt ttaatcaatccaaaatgtcc actatttctt 1320 atgtcattcg ttagtctaca tgtttctaaa catataaatgtgaatttaat caattccttt 1380 catagtttta taattctctg gcagttcctt atgatagagtttataaaaca gtcctgtgta 1440 aactgctgga agttcttcca cagtcaggtc aattttgtcaaacccttctc tgtacccata 1500 cagcagcagc ctagcaactc tgctggtgat gggagttgtattttcagtct tcgccaggtc 1560 attgagatcc atccactcac atcttaagca ttcttcctggcaaaaattta tggtgaatga 1620 atatggcttt aggcggcaga tgatatacat atctgacttcccaaaagctc caggatttgt 1680 gtgctgttgc cgaatactca ggacggacct gaattctgattttataccag tctcttcaaa 1740 aacttctcga accgctgtgt ctcctacgta aaaaaagagatgtacaaatc aataataatt 1800 acacttttag aaactgtatc atcaaagatt ttcagttaaagtagcattat gtaaaggctc 1860 aaaacattac cctaacaaag taaagttttc aatacaaattctttgccttg tggatatcaa 1920 gaaatcccaa aatattttct taccactgta aattcaagaagcttttgaaa tgctgaatat 1980 ttctttggct gctacttgga ggcttatcta cctgtacatttttggggtca gctcttttta 2040 acttcttgct gctctttttc ccaaaaggta aaaatatagattgaaaagtt aaaacatttt 2100 gcatggctgc agttcctttg tttcttgaga taagattccaaagaacttag attcatttct 2160 tcaacaccga aatgctggag gtgtttgatc agttttcaagaaacttggaa tataaataat 2220 tttataattc aacaaaggtt ttcacatttt ataaggttgatttttcaatt aaatgcaaat 2280 ttgtgtggca ggatttttat 
tgccattaac atatttttgtggctgctttt tctacacatc 2340 cagatggtcc ctctaactgg gctttctcta attttgtgatgttctgtcat tgtctcccaa 2400 agtatttagg agaagccctt taaaaagctg ccttcctctaccactttgct ggaaagcttc 2460 acaattgtca cagacaaaga tttttgttcc aatactcgttttgcctctat ttttcttgtt 2520 tgtcaaatag taaatgatat ttgcccttgc agtaattctactggtgaaaa acatgcaaag 2580 aagaggaagt cacagaaaca tgtctcaatt cccatgtgctgtgactgtag actgtcttac 2640 catagactgt cttacccatc ccctggatat gctcttgttttttccctcta atagctatgg 2700 aaagatgcat agaaagagta taatgtttta aaacataaggcattcatctg ccatttttca 2760 attacatgct gacttccctt acaattgaga tttgcccataggttaaacat ggttagaaac 2820 aactgaaagc ataaaagaaa aatctaggcc gggtgcagtggctcatgcct atattccctg 2880 cactttggga ggccaaagca ggaggatcgc ttgagcccaggagttcaaga ccaacctggt 2940 gaaaccccgt ctctacaaaa aaacacaaaa aatagccaggcatggtggcg tgtacatgtg 3000 gtctcagata cttgggaggc tgaggtggga gggttgatcacttgaggctg agaggtcaag 3060 gttgcagtga gccataatcg tgccactgca gtccagcctaggcaacagag tgagactttg 3120 tctcaaaaaa agagaaattt tccttaataa gaaaagtaatttttactctg atgtgcaata 3180 catttgttat taaatttatt atttaagatg gtagcactagtcttaaattg tataaaatat 3240 cccctaacat gtttaaatgt ccatttttat tcattatgctttgaaaaata attatgggga 3300 aatacatgtt tgttattaaa tttattatta aagatagtagcactagtctt aaatttgata 3360 taacatctcc taacttgttt aaatgtccat ttttattctttatgcttgaa aataaattat 3420 ggggatccta tttagctctt agtaccacta atcaaaagttcggcatgtag ctcatgatct 3480 atgctgtttc tatgtcgtgg aagcaccgga tgggggtagtgagcaaatct gccctgctca 3540 gcagtcacca tagcagctga ctgaaaatca gcactgcctgagtagttttg atcagtttaa 3600 cttgaatcac taactgactg aaaattgaat gggcaaataagtgcttttgt ctccagagta 3660 tgcgggagac ccttccacct caagatggat atttcttccccaaggatttc aagatgaatt 3720 gaaattttta atcaagatag tgtgctttat tctgttgtattttttattat tttaatatac 3780 tgtaagccaa actgaaataa catttgctgt tttataggtttgaagaacat aggaaaaact 3840 aagaggtttt gtttttattt ttgctgatga agagatatgtttaaatatgt tgtattgttt 3900 tgtttagtta caggacaata atgaaatgga gtttatatttgttatttcta ttttgttata 3960 tttaataata gaattagatt gaaataaaat ataatgggaaataatctgca 
gaatgtgggt 4020 ttcctggtgt ttcctctgac tctagtgcac tgatgatctctgataaggct cagctgcttt 4080 atagttctct ggctaatgca gcagatactc ttcctgccagtggtaatacg attttttaag 4140 aaggcagttt gtcaatttta atcttgtgga tacctttatactcttagggt attattttat 4200 acaaaagcct tgaggattgc attctatttt ctatatgaccctcttgatat ttaaaaaaca 4260 ctatggataa caattcttca tttacctagt attatgaaagaatgaaggag ttcaaacaaa 4320 tgtgtttccc agttaactag ggtttactgt ttgagccaatataaatgttt aactgtttgt 4380 gatggcagta ttcctaaagt acattgcatg ttttcctaaatacagagttt aaataatttc 4440 agtaattctt agatgattca gcttcatcat taagaatatcttttgtttta tgttgagtta 4500 gaaatgcctt catatagaca tagtctttca gacctctactgtcagttttc atttctagct 4560 gctttcaggg ttttatgaat tttcaggcaa agctttaatttatactaagc ttaggaagta 4620 tggctaatgc caacggcagt ttttttcttc ttaattccacatgactgagg catatatgat 4680 ctctgggtag gtgagttgtt gtgacaacca caagcacttttttttttttt aaagaaaaaa 4740 aggtagtgaa tttttaatca tctggacttt aagaaggattctggagtata cttaggcctg 4800 aaattatata tatttggctt ggaaatgtgt ttttcttcaattacatctac aagtaagtac 4860 agctgaaatt cagaggaccc ataagagttc acatgaaaaaaatcaattca tttgaaaagg 4920 caagatgcag gagagaggaa gccttgcaaa cctgcagactgctttttgcc caatatagat 4980 tgggtaaggc tgcaaaacat aagcttaatt agctcacatgctctgctctc acgtggcacc 5040 agtggatagt gtgagagaat taggctgtag aacaaatggccttctctttc agcattcaca 5100 ccactacaaa atcatctttt atatcaacag aagaataagcataaactaag caaaaggtca 5160 ataagtacct gaaaccaaga ttggctagag atatatcttaatgcaatcca ttttctgatg 5220 gattgttacg agttggctat ataatgtatg tatggtattttgatttgtgt aaaagtttta 5280 aaaatcaagc tttaagtaca tggacatttt taaataaaatatttaaagac aatttagaaa 5340 attgccttaa tatcattgtt ggctaaatag aataggggacatgcatatta aggaaaaggt 5400 catggagaaa taatattggt atcaaacaaa tacattgatttgtcatgata cacattgaat 5460 ttgatccaat agtttaagga ataggtagga aaatttggtttctatttttc gatttcctgt 5520 aaatcagtga cataaataat tcttagctta ttttatatttccttgtctta aatactgagc 5580 tcagtaagtt gtgttagggg attatttctc agttgagactttcttatatg acattttact 5640 atgttttgac ttcctgacta ttaaaaataa atagtagaaacaattttcat aaagtgaaga 5700 attatataat cactgcttta 
taactgactt tattatatttatttcaaagt tcatttaaag 5760 gctactattc atcctctgtg atggaatggt caggaatttgttttctcata gtttaattcc 5820 aacaacaata ttagtcgtat ccaaaataac ctttaatgctaaactttact gatgtatatc 5880 caaagcttct ccttttcaga cagattaatc cagaagcagtcataaacaga agaataggtg 5940 gtatgttcct aatgatatta tttctactaa tggaataaactgtaatatta gaaattatgc 6000 tgctaattat atcagctctg aggtaatttc tgaaatgttcagactcagtc ggaacaaatt 6060 ggaaaattta aatttttatt cttagctata aagcaagaaagtaaacacat taatttcctc 6120 aacattttta agccaattaa aaatataaaa gatacacaccaatatcttct tcaggctctg 6180 acaggcctcc tggaaacttc cacatatttt tcaactgcagtataaagtca gaaaataaag 6240 ttaacataac tttcactaac acacacatat gtagatttcacaaaatccac ctataattgg 6300 tcaaagtggt tgagaatata ttttttagta attgcatgcaaaatttttct agcttccatc 6360 ctttctccct cgtttcttct ttttttgggg gagctggtaactgatgaaat cttttcccac 6420 cttttctctt caggaaatat aagtggtttt gtttggttaacgtgatacat tctgtatgaa 6480 tgaaacattg gagggaaaca tctactgaat ttctgtaatttaaaatattt tgctgctagt 6540 taactatgaa cagatagaag aatcttacag atgctgctataaataagtag aaaatataaa 6600 tttcatcact aaaatatgct attttaaaat ctatttcctatattgtattt ctaatcagat 6660 gtattactct tattatttct attgtatgtg ttaatgattttatgtaaaaa tgtaattgct 6720 tttcatgagt agtatgaata aaattgatta gtttgtgttttcttgtctcc cgaaaaaaaa 6780 aaaaaaaaaa aaaaaaaaaa aa 6802 32 2499 DNAHomo sapiens 32 agatgcgagc actgcggctg ggcgctgagg atcagccgct tcctgcctggattccacagc 60 ttcgcgccgt gtactgtcgc cccatccctg cgcgcccagc ctgccaagcagcgtgccccg 120 gttgcaggcg tcatgcagcg ggcgcgaccc acgctctggg ccgctgcgctgactctgctg 180 gtgctgctcc gcgggccgcc ggtggcgcgg gctggcgcga gctcggggggcttgggtccc 240 gtggtgcgct gcgagccgtg cgacgcgcgt gcactggccc agtgcgcgcctccgcccgcc 300 gtgtgcgcgg agctggtgcg cgagccgggc tgcggctgct gcctgacgtgcgcactgagc 360 gagggccagc cgtgcggcat ctacaccgag cgctgtggct ccggccttcgctgccagccg 420 tcgcccgacg aggcgcgacc gctgcaggcg ctgctggacg gccgcgggctctgcgtcaac 480 gctagtgccg tcagccgcct gcgcgcctac ctgctgccag cgccgccagctccaggaaat 540 gctagtgagt cggaggaaga ccgcagcgcc ggcagtgtgg agagcccgtccgtctccagc 600 acgcaccggg 
tgtctgatcc caagttccac cccctccatt caaagataatcatcatcaag 660 aaagggcatg ctaaagacag ccagcgctac aaagttgact acgagtctcagagcacagat 720 acccagaact tctcctccga gtccaagcgg gagacagaat atggtccctgccgtagagaa 780 atggaagaca cactgaatca cctgaagttc ctcaatgtgc tgagtcccaggggtgtacac 840 attcccaact gtgacaagaa gggattttat aagaaaaagc agtgtcgcccttccaaaggc 900 aggaagcggg gcttctgctg gtgtgtggat aagtatgggc agcctctcccaggctacacc 960 accaagggga aggaggacgt gcactgctac agcatgcaga gcaagtagacgcctgccgca 1020 aggttaatgt ggagctcaaa tatgccttat tttctacaaa agactgccaaggacatgacc 1080 agcagctggc tacagcctcg atttatattt ctgtttgtgg tgaactgattttttttaaac 1140 caaagtttag aaagaggttt ttgaaatgcc tatggtttct ttgaatggtaaacttgagca 1200 tcttttcact ttccagtagt cagcaaagag cagtttgaat tttcttgtcgcttcctatca 1260 aaatatctag agactcgagc acagcaccca gacttcatgc gcccgtggaatgctcaccac 1320 atgttggtcg aagcggccga ccactgactt tgtgacttag gcggctgtgttgcctatgta 1380 gagaacacgc ttcaccccca ctccctgtac agtgcgcaca ggctttatcgagaataggaa 1440 aacctttaaa ccccggtcat ccggacatcc caacgcatgc tcctggagctcacagccttc 1500 tgtggtgtca tttctgaaac aagggcgtgg atccctcaac ccagaagagtgtttatgtct 1560 tcaagtgacc tgtactgctt ggggactatt tgagaaaata aggtggagtcctacttgttt 1620 cacaaatatg tatctaagaa tgttctaggg cactctggga acctataaaggcaggtattt 1680 cgggccctcc tcttcaggaa tcttcctgaa gacatggccc agtcgaaggcccaggatggc 1740 ttttgctgcg gccccgtggg gtaggaggga cagagagaca gggagagtcagcctccacat 1800 tcagaggcat cacaagtaat ggcacaattc ttcggatgac tgcagaaaatagtgttttgt 1860 agttcaacaa ctcaagacga agcttatttc tgaggataag ctctttaaagacaaagcttt 1920 attttcatct ctcatctttt gtcctcctta gcacaatgca aaaaagaatagtaatatcag 1980 aacaggaagg aggaatggct tgctggggag cccatccagg acactgggagcacatagaga 2040 ttcacccatg tttgttgaac ttagagtcat tctcatgctt ttctttataattcacacata 2100 tatgcagaga agatatgttc ttgttaacat tgtatacaac atagccccaaatatagtaag 2160 atctatacta gataatccta gatgaaatgt tagagatgct atatgatacaactgtggcca 2220 tgactgagga aaggagctca cgcccagaga ctgggctgct ctcccggaggccaaacccaa 2280 gaaggtctgg caaagtcagg ctcagggaga ctctgccctg 
ctgcagacctcggtgtggac 2340 acacgctgca tagagctctc cttgaaaaca gaggggtctc aagacattctgcctacctat 2400 tagcttttct ttattttttt aactttttgg ggggaaaagt atttttgagaagtttgtctt 2460 gcaatgtatt tataaatagt aaataaagtt tttaccatt 2499 33 4114DNA Homo sapiens 33 attaattctg gctccacttg ttgctcggcc caggttggggagaggacgga gggtggccgc 60 agcgggttcc tgagtgaatt acccaggagg gactgagcacagcaccaact agagaggggt 120 cagggggtgc gggactcgag cgagcaggaa ggaggcagcgcctggcacca gggctttgac 180 tcaacagaat tgagacacgt ttgtaatcgc tggcgtgccccgcgcacagg atcccagcga 240 aaatcagatt tcctggtgag gttgcgtggg tggattaatttggaaaaaga aactgcctat 300 atcttgccat caaaaaactc acggaggaga agcgcagtcaatcaacagta aacttaagag 360 acccccgatg ctcccctggt ttaacttgta tgcttgaaaattatctgaga gggaataaac 420 atcttttcct tcttccctct ccagaagtcc attggaatattaagcccagg agttgctttg 480 gggatggctg gaagtgcaat gtcttccaag ttcttcctagtggctttggc catatttttc 540 tccttcgccc aggttgtaat tgaagccaat tcttggtggtcgctaggtat gaataaccct 600 gttcagatgt cagaagtata tattatagga gcacagcctctctgcagcca actggcagga 660 ctttctcaag gacagaagaa actgtgccac ttgtatcaggaccacatgca gtacatcgga 720 gaaggcgcga agacaggcat caaagaatgc cagtatcaattccgacatcg acggtggaac 780 tgcagcactg tggataacac ctctgttttt ggcagggtgatgcagatagg cagccgcgag 840 acggccttca catacgccgt gagcgcagca ggggtggtgaacgccatgag ccgggcgtgc 900 cgcgagggcg agctgtccac ctgcggctgc agccgcgccgcgcgccccaa ggacctgccg 960 cgggactggc tctggggcgg ctgcggcgac aacatcgactatggctaccg ctttgccaag 1020 gagttcgtgg acgcccgcga gcgggagcgc atccacgccaagggctccta cgagagtgct 1080 cgcatcctca tgaacctgca caacaacgag gccggccgcaggacggtgta caacctggct 1140 gatgtggcct gcaagtgcca tggggtgtcc ggctcatgtagcctgaagac atgctggctg 1200 cagctggcag acttccgcaa ggtgggtgat gccctgaaggagaagtacga cagcgcggcg 1260 gccatgcggc tcaacagccg gggcaagttg gtacaggtcaacagccgctt caactcgccc 1320 accacacaag acctggtcta catcgacccc agccctgactactgcgtgcg caatgagagc 1380 accggctcgc tgggcacgca gggccgcctg tgcaacaagacgtcggaggg catggatggc 1440 tgcgagctca tgtgctgcgg ccgtgggtac gaccagttcaagaccgtgca gacggagcgc 1500 tgccactgca agttccactg 
gtgctgctac gtcaagtgcaagaagtgcac ggagatcgtg 1560 gaccagtttg tgtgcaagta gtgggtgcca cccagcactcagccccgctc ccaggacccg 1620 cttatttata gaaagtacag tgattctggt ttttggtttttagaaatatt ttttattttt 1680 ccccaagaat tgcaaccgga accatttttt ttcctgttaccatctaagaa ctctgtggtt 1740 tattattaat attataatta ttatttggca ataatgggggtgggaaccac gaaaaatatt 1800 tattttgtgg atctttgaaa aggtaataca agacttcttttggatagtat agaatgaagg 1860 gggaaataac acatacccta acttagctgt gtgggacatggtacacatcc agaaggtaaa 1920 gaaatacatt ttctttttct caaatatgcc atcatatgggatgggtaggt tccagttgaa 1980 agagggtggt agaaatctat tcacaattca gcttctatgaccaaaatgag ttgtaaattc 2040 tctggtgcaa gataaaaggt cttgggaaaa caaaacaaaacaaaacaaac ctcccttccc 2100 cagcagggct gctagcttgc tttctgcatt ttcaaaatgataatttacaa tggaaggaca 2160 agaatgtcat attctcaagg aaaaaaggta tatcacatgtctcattctcc tcaaatattc 2220 catttgcaga cagaccgtca tattctaata gctcatgaaatttgggcagc agggaggaaa 2280 gtccccagaa attaaaaaat ttaaaactct tatgtcaagatgttgatttg aagctgttat 2340 aagaattggg attccagatt tgtaaaaaga cccccaatgattctggacac tagatttttt 2400 gtttggggag gttggcttga acataaatga aatatcctgtattttcttag ggatacttgg 2460 ttagtaaatt ataatagtag aaataataca tgaatcccattcacaggttt ctcagcccaa 2520 gcaacaaggt aattgcgtgc cattcagcac tgcaccagagcagacaacct atttgaggaa 2580 aaacagtgaa atccaccttc ctcttcacac tgagccctctctgattcctc cgtgttgtga 2640 tgtgatgctg gccacgtttc caaacggcag ctccactgggtcccctttgg ttgtaggaca 2700 ggaaatgaaa cattaggagc tctgcttgga aaacagttcactacttaggg atttttgttt 2760 cctaaaactt ttattttgag gagcagtagt tttctatgttttaatgacag aacttggcta 2820 atggaattca cagaggtgtt gcagcgtatc actgttatgatcctgtgttt agattatcca 2880 ctcatgcttc tcctattgta ctgcaggtgt accttaaaactgttcccagt gtacttgaac 2940 agttgcattt ataagggggg aaatgtggtt taatggtgcctgatatctca aagtcttttg 3000 tacataacat atatatatat atacatatat ataaatataaatataaatat atctcattgc 3060 agccagtgat ttagatttac agcttactct ggggttatctctctgtctag agcattgttg 3120 tccttcactg cagtccagtt gggattattc caaaagttttttgagtcttg agcttgggct 3180 gtggccccgc tgtgatcata ccctgagcac gacgaagcaacctcgtttct 
gaggaagaag 3240 cttgagttct gactcactga aatgcgtgtt gggttgaagatatctttttt tcttttctgc 3300 ctcacccctt tgtctccaac ctccatttct gttcactttgtggagagggc attacttgtt 3360 cgttatagac atggacgtta agagatattc aaaactcagaagcatcagca atgtttctct 3420 tttcttagtt cattctgcag aatggaaacc catgcctattagaaatgaca gtacttatta 3480 attgagtccc taaggaatat tcagcccact acatagatagcttttttttt tttttttttt 3540 ttttaataag gacacctctt tccaaacagg ccatcaaatatgttcttatc tcagacttac 3600 gttgttttaa aagtttggaa agatacacat cttttcatacccccccttag gaggttgggc 3660 tttcatatca cctcagccaa ctgtggctct taatttattgcataatgata tccacatcag 3720 ccaactgtgg ctctttaatt tattgcataa tgatattcacatcccctcag ttgcagtgaa 3780 ttgtgagcaa aagatcttga aagcaaaaag cactaattagtttaaaatgt cacttttttg 3840 gtttttatta tacaaaaacc atgaagtact ttttttatttgctaaatcag attgttcctt 3900 tttagtgact catgtttatg aagagagttg agtttaacaatcctagcttt taaaagaaac 3960 tatttaatgt aaaatattct acatgtcatt cagatattatgtatatcttc tagcctttat 4020 tctgtacttt taatgtacat atttctgtct tgcgtgatttgtatatttca ctggtttaaa 4080 aaacaaacat cgaaaggctt attccaaatg gaag 4114 347694 DNA Homo sapiens 34 gcaacgaagg taccatggcc gttgtcgtcg ccgccgcggctcccggggct ggatgggggg 60 ccgaggccag ccagtggcac ccggaagaaa gagacgcggcggcggcgacg ccgacaccct 120 caggacgagt gtccggactt gcccacagcc tcaaggaggagacggcgagg cccggccccc 180 gctgtccctg gtgtaaagaa gtcgccgtag ccgtcgcggccgggactccc cgggctctcg 240 cccttcaggt ttcgttgaca ctcaggaccg tacgtacgctgcgccatgtt caagaaactg 300 aagcaaaaga tcagcgagga gcagcagcag ctccagcaggcgctggctcc tgctcaggcg 360 tcctccaatt cttcaacacc aacaagaatg aggagcaggacatcttcatt tacagagcaa 420 cttgatgaag gtacacccaa tagagagtca ggtgacacacagtcttttgc acagaagctc 480 cagctccggg tgccctccgt ggagtctttg tttcgaagtccgataaagga atctctattc 540 cggtcttctt ctaaagagtc tttggtacga acatcttccagagaatccct gaatcgactt 600 gacctggaca gttctactgc cagttttgat ccaccctctgatatggatag cgaggctgaa 660 gacttggtag ggaattcaga cagtctcaac aaagaacagttgattcagcg gttgcgaaga 720 atggaacgaa gcttaagtag ctacagggga aaatattctgagcttgttac agcttatcag 780 atgcttcaga gagagaagaa aaagctacaa 
ggtatattaagtcagagtca ggataaatca 840 cttcggagaa tagcagaatt aagagaggag ctccaaatggaccagcaggc aaagaaacat 900 ctgcaagagg agtttgatgc atctttagag gagaaagatcagtatatcag tgttctccaa 960 actcaggttt ctctactgaa acaacgatta cgaaatggcccgatgaatgt tgatgtactg 1020 aaaccacttc ctcagctgga accacaggct gaagtcttcactaaagaaga gaatccagaa 1080 agtgatggag agccagtagt ggaagatgga acttctgtaaaaacactgga aacactccag 1140 caaagagtga agcgtcaaga gaacctactt aagcgttgtaaggaaacaat tcagtcacat 1200 aaggaacaat gtacactatt aactagtgaa aaagaagctctgcaagaaca actggatgaa 1260 agacttcaag aactagaaaa gataaaggac cttcatatggccgagaagac taaacttatc 1320 actcagttgc gtgatgcaaa gaacttaatt gaacagcttgaacaagataa gggaatggta 1380 atcgcagaga caaaacgtca gatgcatgaa accctggaaatgaaagaaga agaaattgct 1440 caactccgta gtcgcatcaa acagatgact acccagggagaggaattacg ggaacagaaa 1500 gaaaagtccg aaagagctgc ttttgaggaa cttgaaaaagctttgagtac agcccaaaaa 1560 acagaggaag cacggagaaa actgaaggca gaaatggatgaacaaataaa aactatcgaa 1620 aaaacaagtg aggaggaacg catcagtctt caacaggaattaagtcgggt gaaacaggag 1680 gttgttgatg taatgaaaaa atcctcagaa gaacaaattgctaagctaca gaagcttcat 1740 gaaaaggagc tggccagaaa agagcaggaa ctgaccaagaagcttcagac ccgagaaagg 1800 gaatttcagg aacaaatgaa agtagctctt gaaaagagtcaatcagaata tttgaagatc 1860 agccaagaaa aagaacagca agaatctttg gccctagaagagttagagtt gcagaaaaaa 1920 gcaatcctca cagaaagtga aaataaactt cgggaccttcagcaagaagc agagacttac 1980 agaactagaa ttcttgaatt ggaaagttct ttggaaaaaagcttacaaga aaacaaaaat 2040 cagtcaaaag atttggctgt tcatctggaa gctgaaaaaaataagcacaa taaggagatt 2100 acagtcatgg ttgaaaaaca caagacagaa ttggaaagccttaagcatca gcaggatgcc 2160 ctttggactg aaaaactcca agtcttaaag caacaatatcagactgaaat ggaaaaactt 2220 agggaaaagt gtgaacaaga aaaagaaaca ttgttgaaagacaaagagat tatcttccag 2280 gcccacatag aagaaatgaa tgaaaagact ttagaaaagcttgatgtgaa gcaaacagaa 2340 ctagaatcat tatcttctga actgtcagaa gtattaaaagcccgtcacaa actagaagag 2400 gaactttctg ttctgaaaga tcaaacagat aaaatgaagcaggaattaga ggccaagatg 2460 gatgaacaga aaaatcatca ccagcagcaa gttgacagtatcattaaaga acacgaggta 2520 
tctatccaga ggactgagaa ggcattaaaa gatcaaattaatcaacttga gcttctcttg 2580 aaggaaaggg acaagcattt gaaagagcat caggctcatgtagaaaattt agaggcagat 2640 attaaaaggt ctgaagggga actccagcag gcatctgctaagctggacgt ttttcagtct 2700 taccagagtg ccacacatga gcagacaaaa gcatatgaggaacagttggc ccaattgcag 2760 cagaagttgt tggatttgga aacagaaaga attcttcttaccaaacaggt tgctgaagtt 2820 gaagcacaaa agaaagatgt ttgtactgag ttagatgctcacaaaatcca ggtgcaggac 2880 ttaatgcagc aacttgaaaa acaaaatagt gaaatggagcaaaaagtaaa atctttaacc 2940 caagtctatg agtccaaact tgaagatggt aacaaagaacaggaacagac aaagcaaatc 3000 ttggtggaaa aggaaaatat gattttacaa atgagagaaggacagaagaa agaaattgag 3060 atactcacac agaaattgtc agccaaggag gacagtattcatattttgaa tgaggaatat 3120 gaaaccaaat ttaaaaacca agaaaaaaag atggaaaaagttaagcagaa agcaaaggag 3180 atgcaagaaa cgttaaagaa aaaattactg gatcaggaagccaaacttaa gaaagagctt 3240 gaaaatactg ctctagagct tagtcagaaa gaaaaacagtttaatgccaa aatgctggaa 3300 atggcacagg ctaactcagc tggaatcagt gatgcagtgtcaagactgga aacaaaccaa 3360 aaagaacaaa tagaaagtct tactgaggtt catcgacgagaactcaatga tgtcatatca 3420 atctgggaaa agaaacttaa tcagcaagct gaagaacttcaggaaataca tgaaatccaa 3480 ttacaggaaa aagaacaaga ggtagcagaa ctgaaacaaaagatcctcct atttgggtgt 3540 gaaaaagaag agatgaacaa ggaaataaca tggctgaaggaagaaggtgt taagcaggat 3600 acaacattaa atgaattaca ggaacagtta aagcagaagtctgcccatgt gaattctctt 3660 gcacaagatg aaactaaact gaaagctcat cttgaaaagctagaggttga cttgaataag 3720 tctctgaagg aaaatacttt tcttcaagag cagctagttgaactgaagat gctggcagaa 3780 gaagataagc ggaaggtttc tgagttgact agcaagttgaaaaccacaga tgaagaattc 3840 cagagtttga aatcttcaca tgaaaaaagt aacaaaagcctagaggacaa gagcttggaa 3900 tttaaaaaac tgtctgagga actagcgatt cagctagatatttgctgtaa gaaaaccgaa 3960 gccttattag aagctaaaac aaatgagcta atcaacattagtagtagtaa aactaatgcc 4020 attctttcta ggatttctca ttgtcagcac cgtacaactaaagttaagga ggcactgtta 4080 attaaaactt gcacagtttc tgaattagaa gcacaacttagacagttgac agaggagcaa 4140 aatacactaa atatttcttt tcaacaggct actcatcagttagaagaaaa agaaaatcaa 4200 attaagagca tgaaggctga tattgaaagt 
cttgtaacagaaaaagaagc cttacagaag 4260 gaaggaggca atcagcaaca ggctgcttct gaaaaggagtcttgtataac acagttgaag 4320 aaagagttat ctgaaaacat caatgctgtc acattgatgaaagaagagct taaagaaaaa 4380 aaagttgaga ttagcagtct tagtaaacaa ctaactgatttgaatgttca gcttcaaaat 4440 agcatcagcc tatccgaaaa agaagcagcc atttcatcactaagaaagca gtatgatgaa 4500 gaaaaatgtg aattgctgga tcaggtgcaa gatttatcttttaaagttga cactctgagt 4560 aaagagaaaa tttctgctct tgagcaggta gatgactggtccaataaatt ctcagaatgg 4620 aagaagaaag cacagtcaag atttacacag catcaaaacactgttaaaga attgcagatc 4680 cagcttgagt taaaatcaaa ggaagcttat gaaaaggatgagcagataaa tttattgaag 4740 gaagagcttg atcagcaaaa taaaagattt gattgtttaaagggtgaaat ggaagacgac 4800 aagagcaaga tggagaaaaa ggagtctaat ttagaaacagagttaaagtc tcaaacagca 4860 agaattatgg aattagagga ccatattacc cagaaaactattgaaataga gtccttaaat 4920 gaagttctta aaaattacaa tcaacaaaag gatattgaacacaaagaatt ggttcagaaa 4980 cttcaacatt ttcaagagtt aggagaagaa aaggacaacagggttaaaga agctgaagaa 5040 aaaatcttaa cacttgaaaa ccaagtttat tccatgaaagctgaacttga aactaagaag 5100 aaagaattag aacatgtgaa tttaagtgtg aaaagcaaagaggaggagtt aaaggcattg 5160 gaagataggc ttgagtcaga aagtgctgca aaattagcagagttgaagag aaaagctgaa 5220 caaaaaattg ctgccattaa gaagcagttg ttatctcaaatggaagagaa agaagaacag 5280 tataaaaaag gtacagaaag ccatttgagt gagctaaatacaaaattgca ggaaagagaa 5340 agggaagttc acatcttgga agaaaaactt aagtcagtggaaagttcaca gtcagaaaca 5400 ttaattgtac ccagatcagc aaaaaatgtg gcagcatatactgaacaaga agaagcagat 5460 tcccaaggct gtgtgcagaa gacatatgaa gaaaaaatcagtgttttaca aagaaactta 5520 actgaaaaag aaaagctatt gcagagggta gggcaggaaaaagaagagac agtttcttct 5580 cattttgaaa tgcgatgcca ataccaggag cgcttaataaagctagaaca tgctgaggca 5640 aagcaacatg aagatcaaag tatgataggt catcttcaagaggagcttga agaaaaaaac 5700 aagaaatatt ccttgatagt agcccagcat gtggaaaaagaaggaggtaa aaataacata 5760 caggcaaagc aaaacttgga aaatgtgttt gacgacgtccagaaaaccct ccaggagaag 5820 gaactaacct gtcagatttt ggagcaaaag ataaaagagctggattcctg cttagtaaga 5880 cagaaagaag tacatagagt tgaaatggaa gagttgacctcaaaatatga aaaattacag 5940 
gctttacaac agatggatgg aagaaataaa cccacagaacttttggaaga aaacactgaa 6000 gaaaagtcca aatcacattt ggtccaaccc aaattgcttagtaacatgga agcccagcac 6060 aatgatctgg agtttaaatt agccggggca gaacgggagaaacagaaact gggcaaggag 6120 attgttagat tgcagaaaga ccttcgaatg ttgagaaaggagcatcagca agaattggaa 6180 atactaaaga aagaatatga tcaagaaagg gaagagaaaatcaaacagga gcaggaagat 6240 cttgaactga agcacaattc cacattaaaa cagctgatgagggagtttaa tacacagctg 6300 gcacaaaagg aacaagagct ggaaatgacc ataaaagaaactatcaataa ggcccaggag 6360 gtggaggctg aacttttaga aagccatcaa gaagagacaaatcagttact taaaaaaatt 6420 gctgagaaag atgatgatct aaaacgaaca gccaaaagatatgaagaaat ccttgatgct 6480 cgtgaagaag aaatgactgc aaaagtaagg gacctgcagactcaacttga ggagctgcag 6540 aagaaatacc agcaaaagct agagcaggag gagaaccctggcaatgataa tgtaacaatt 6600 atggagctac agacacagct agcacagaag acgactttaatcagtgattc gaaattgaaa 6660 gagcaagagt tcagagaaca gattcacaat ttagaagaccgtttgaagaa atatgaaaag 6720 aatgtatatg caacaactgt ggggacacct tacaaaggtggcaatttgta ccatacggat 6780 gtctcactct ttggagaacc taccgaattt gagtatttgcgaaaagtgct ttttgagtat 6840 atgatgggtc gtgagactaa gaccatggca aaagttataaccaccgtact gaagttccct 6900 gatgatcaga ctcagaaaat tttggaaaga gaagatgctcggctgatgtt tacttcacct 6960 cgcagtggta tcttctgagt aaaccatcag tctgtgcttagttaacatgt gtcatggctc 7020 cgatcttcat cttgaagaag agtgacattg ggtgactgctgcttggaaaa ctgtccacac 7080 ttgctactct ttgagaatga agttgtcatt cagggcccctcatgtagcca aaagaccaag 7140 aaaaatctgg cccacagata agttgcagac tgcctttaaaatagatttta tcagtggaga 7200 aatggtgata gttttttctt cagttttctc ttgggaaggagttttatgtt gtttaaaaga 7260 tattttgata acttaacctg ctttatgggc ttacataatattcctttcat ccattctttt 7320 taaagaacgg cttacctttc ctatttattt ttagggtgattttttaaaaa gacttgtgca 7380 atacattttg aggtgaaact tagtggattt tttctgataaattagagcat ttaattgact 7440 attttattca ggttgatctg ttgaatattt gctaaagaccagttctttaa gctaagacat 7500 gtaaaaaatc ccaaatggca gtacctcatt gtttacttagcttttgtact tatatttttc 7560 agaggaaaaa acactactgt aaattgtgaa tagccaatacataactgtat tgtatgcaaa 7620 tctgtgattg ttggcagtgt catctctgag 
aaacagataaataaagttta tttactataa 7680 aaaaaaaaaa aaaa 7694 35 5011 DNA Homo sapiens35 ccaggcggcg ttgcggcccc ggccccggct ccctgcgccg ccgccgccgc cgccgccgcc 60gccgccgccg ccgccgccag cgctagcgcc agcagccggg cccgatcacc cgccgcccgg 120tgcccgccgc cgcccgcgcc agcaaccggg cccgatcacc cgccgcccgg tgcccgccgc 180cgcccgcgcc accggcatgg cgctccgggg cttctgcagc gccgatggct ccgacccgct 240ctgggactgg aatgtcacgt ggaataccag caaccccgac ttcaccaagt gctttcagaa 300cacggtcctc gtgtgggtgc cttgttttta cctctgggcc tgtttcccct tctacttcct 360ctatctctcc cgacatgacc gaggctacat tcagatgaca cctctcaaca aaaccaaaac 420tgccttggga tttttgctgt ggatcgtctg ctgggcagac ctcttctact ctttctggga 480aagaagtcgg ggcatattcc tggccccagt gtttctggtc agcccaactc tcttgggcat 540caccacgctg cttgctacct ttttaattca gctggagagg aggaagggag ttcagtcttc 600agggatcatg ctcactttct ggctggtagc cctagtgtgt gccctagcca tcctgagatc 660caaaattatg acagccttaa aagaggatgc ccaggtggac ctgtttcgtg acatcacttt 720ctacgtctac ttttccctct tactcattca gctcgtcttg tcctgtttct cagatcgctc 780acccctgttc tcggaaacca tccacgaccc taatccctgc ccagagtcca gcgcttcctt 840cctgtcgagg atcaccttct ggtggatcac agggttgatt gtccggggct accgccagcc 900cctggagggc agtgacctct ggtccttaaa caaggaggac acgtcggaac aagtcgtgcc 960tgttttggta aagaactgga agaaggaatg cgccaagact aggaagcagc cggtgaaggt 1020tgtgtactcc tccaaggatc ctgcccagcc gaaagagagt tccaaggtgg atgcgaatga 1080ggaggtggag gctttgatcg tcaagtcccc acagaaggag tggaacccct ctctgtttaa 1140ggtgttatac aagacctttg ggccctactt cctcatgagc ttcttcttca aggccatcca 1200cgacctgatg atgttttccg ggccgcagat cttaaagttg ctcatcaagt tcgtgaatga 1260cacgaaggcc ccagactggc agggctactt ctacaccgtg ctgctgtttg tcactgcctg 1320cctgcagacc ctcgtgctgc accagtactt ccacatctgc ttcgtcagtg gcatgaggat 1380caagaccgct gtcattgggg ctgtctatcg gaaggccctg gtgatcacca attcagccag 1440aaaatcctcc acggtcgggg agattgtcaa cctcatgtct gtggacgctc agaggttcat 1500ggacttggcc acgtacatta acatgatctg gtcagccccc ctgcaagtca tccttgctct 1560ctacctcctg tggctgaatc tgggcccttc cgtcctggct ggagtggcgg tgatggtcct 1620catggtgccc gtcaatgctg tgatggcgat 
gaagaccaag acgtatcagg tggcccacat 1680gaagagcaaa gacaatcgga tcaagctgat gaacgaaatt ctcaatggga tcaaagtgct 1740aaagctttat gcctgggagc tggcattcaa ggacaaggtg ctggccatca ggcaggagga 1800gctgaaggtg ctgaagaagt ctgcctacct gtcagccgtg ggcaccttca cctgggtctg 1860cacgcccttt ctggtggcct tgtgcacatt tgccgtctac gtgaccattg acgagaacaa 1920catcctggat gcccagacag ccttcgtgtc tttggccttg ttcaacatcc tccggtttcc 1980cctgaacatt ctccccatgg tcatcagcag catcgtgcag gcgagtgtct ccctcaaacg 2040cctgaggatc tttctctccc atgaggagct ggaacctgac agcatcgagc gacggcctgt 2100caaagacggc gggggcacga acagcatcac cgtgaggaat gccacattca cctgggccag 2160gagcgaccct cccacactga atggcatcac cttctccatc cccgaaggtg ctttggtggc 2220cgtggtgggc caggtgggct gcggaaagtc gtccctgctc tcagccctct tggctgagat 2280ggacaaagtg gaggggcacg tggctatcaa gggctccgtg gcctatgtgc cacagcaggc 2340ctggattcag aatgattctc tccgagaaaa catccttttt ggatgtcagc tggaggaacc 2400atattacagg tccgtgatac aggcctgtgc cctcctccca gacctggaaa tcctgcccag 2460tggggatcgg acagagattg gcgagaaggg cgtgaacctg tctgggggcc agaagcagcg 2520cgtgagcctg gcccgggccg tgtactccaa cgctgacatt tacctcttcg atgatcccct 2580ctcagcagtg gatgcccatg tgggaaaaca catctttgaa aatgtgattg gccccaaggg 2640gatgctgaag aacaagacgc ggatcttggt cacgcacagc atgagctact tgccgcaggt 2700ggacgtcatc atcgtcatga gtggcggcaa gatctctgag atgggctcct accaggagct 2760gctggctcga gacggcgcct tcgctgagtt cctgcgtacc tatgccagca cagagcagga 2820gcaggatgca gaggagaacg gggtcacggg cgtcagcggt ccagggaagg aagcaaagca 2880aatggagaat ggcatgctgg tgacggacag tgcagggaag caactgcaga gacagctcag 2940cagctcctcc tcctatagtg gggacatcag caggcaccac aacagcaccg cagaactgca 3000gaaagctgag gccaagaagg aggagacctg gaagctgatg gaggctgaca aggcgcagac 3060agggcaggtc aagctttccg tgtactggga ctacatgaag gccatcggac tcttcatctc 3120cttcctcagc atcttccttt tcatgtgtaa ccatgtgtcc gcgctggctt ccaactattg 3180gctcagcctc tggactgatg accccatcgt caacgggact caggagcaca cgaaagtccg 3240gctgagcgtc tatggagccc tgggcatttc acaagggatc gccgtgtttg gctactccat 3300ggccgtgtcc atcgggggga tcttggcttc ccgctgtctg cacgtggacc tgctgcacag 
3360catcctgcgg tcacccatga gcttctttga gcggaccccc agtgggaacc tggtgaaccg 3420cttctccaag gagctggaca cagtggactc catgatcccg gaggtcatca agatgttcat 3480gggctccctg ttcaacgtca ttggtgcctg catcgttatc ctgctggcca cgcccatcgc 3540cgccatcatc atcccgcccc ttggcctcat ctacttcttc gtccagaggt tctacgtggc 3600ttcctcccgg cagctgaagc gcctcgagtc ggtcagccgc tccccggtct attcccattt 3660caacgagacc ttgctggggg tcagcgtcat tcgagccttc gaggagcagg agcgcttcat 3720ccaccagagt gacctgaagg tggacgagaa ccagaaggcc tattacccca gcatcgtggc 3780caacaggtgg ctggccgtgc ggctggagtg tgtgggcaac tgcatcgttc tgtttgctgc 3840cctgtttgcg gtgatctcca ggcacagcct cagtgctggc ttggtgggcc tctcagtgtc 3900ttactcattg caggtcacca cgtacttgaa ctggctggtt cggatgtcat ctgaaatgga 3960aaccaacatc gtggccgtgg agaggctcaa ggagtattca gagactgaga aggaggcgcc 4020ctggcaaatc caggagacag ctccgcccag cagctggccc caggtgggcc gagtggaatt 4080ccggaactac tgcctgcgct accgagagga cctggacttc gttctcaggc acatcaatgt 4140cacgatcaat gggggagaaa aggtcggcat cgtggggcgg acgggagctg ggaagtcgtc 4200cctgaccctg ggcttatttc ggatcaacga gtctgccgaa ggagagatca tcatcgatgg 4260catcaacatc gccaagatcg gcctgcacga cctccgcttc aagatcacca tcatccccca 4320ggaccctgtt ttgttttcgg gttccctccg aatgaacctg gacccattca gccagtactc 4380ggatgaagaa gtctggacgt ccctggagct ggcccacctg aaggacttcg tgtcagccct 4440tcctgacaag ctagaccatg aatgtgcaga aggcggggag aacctcagtg tcgggcagcg 4500ccagcttgtg tgcctagccc gggccctgct gaggaagacg aagatccttg tgttggatga 4560ggccacggca gccgtggacc tggaaacgga cgacctcatc cagtccacca tccggacaca 4620gttcgaggac tgcaccgtcc tcaccatcgc ccaccggctc aacaccatca tggactacac 4680aagggtgatc gtcttggaca aaggagaaat ccaggagtac ggcgccccat cggacctcct 4740gcagcagaga ggtcttttct acagcatggc caaagacgcc ggcttggtgt gagccccaga 4800gctggcatat ctggtcagaa ctgcagggcc tatatgccag cgcccaggga ggagtcagta 4860cccctggtaa accaagcctc ccacactgaa accaaaacat aaaaaccaaa cccagacaac 4920caaaacatat tcaaagcagc agccaccgcc atccggtccc ctgcctggaa ctggctgtga 4980agacccagga gagacagaga tgcgaaccac c 5011 36 2007 DNA Homo sapiens 36tttaataacc atggactcca agtacagcag 
caacagcaaa ggaatctctc actacatgaa 60tacatgagta tggaattatt gcaagaagct ggtgtctccg ttcccaaagg atatgtggca 120aagtcaccag atgaagctta tgcaattgcc aaaaaattag gttcaaaaga tgtcgtgata 180aaggcacagg ttttagctgg tggtagagga aaaggaacat ttgaaagtgg cctcaaagga 240ggagtgaaga tagttttctc tccagaagaa gcaaaagctg tttcttcaca aatgattggg 300aaaaaattgt ttaccaagca aacgggagaa aagggcagaa tatgcaatca agtattggtc 360tgtgagcgaa aatatcccag gagagaatac tactttgcaa taacaatgga aaggtcattt 420caaggtcctg tattaatagg aagttcacat ggtggtgtca acattgaaga tgttgctgct 480gagactcctg aagcaataat taaagaacct attgatattg aagaaggcat caaaaaggaa 540caagctctcc agcttgcaca gaagatggga tttccaccta atattgtgga atcagcagca 600gaaaacatgg tcaagcttta cagccttttt ctgaaatacg atgcaaccat gatagaaata 660aatccaatgg tggaagattc agatggagct gtattgtgta tggatgcaaa gatcaatttt 720gactctaatt cagcctatcg ccaaaagaaa atctttgatc tacaggactg gacccaggaa 780gatgaaaggg acaaagatgc tgctaaggca aatctcaact acattggcct cgatggaaat 840ataggctgcc tagtaaatgg tgctggtttg gctatggcca caatggatat aataaaactt 900catggaggga ctccagccaa cttccttgat gttggtggtg gtgctacagt ccatcaagta 960acagaagcat ttaagcttat cacttcagat aaaaaggtac tggctattct ggtcaacatt 1020tttggaggaa tcatgcgctg tgatgttatt gcacagggta tagtcatggc agtaaaagac 1080ttggaaatta aaatacctgt tgtggtacgg ttacaaggta cacgagtcga tgatgctaag 1140gcactgatag cggacagtgg acttaaaata cttgcttgtg atgacttgga tgaagctgct 1200agaatggttg taaagctctc tgaaatagtg accttagcga agcaagcaca tgtggatgtg 1260aaatttcagt tgccaatatg atctgaaaac ccagtggatg gctgaaggtg ttaaatgtgc 1320tataatcatt aagaatactg tgttctgtgt tattgttctt tttcttttta gtgtgtggag 1380attgtaattg ccatctaggc acacaaacat ttaaaaggat ttggactgca tttaattgta 1440ccattcagaa tggactgttt gtacgaagca tgtataatgc agttatcttc tttcttttgt 1500cgcagccagt cttttttgct tctcctacaa aacgtaactt gcaatttgcc agtttattat 1560tgttggatac aaagttcttc attgataaga gtcctataaa taagataaat acgaagataa 1620agctttattc tttagtgtta aaatacagta tatctaataa ctagcctcat tagtagagca 1680gtatattaaa acaatgtttt atgtaaaaag tgtttatctt cagcaccaaa tacatgataa 1740atgtatcaat 
cactatttat aaacagagct ttcaaacact cctcagaata ttcttctaag 1800tattttgatg aagtaacttt gtaattattt gaacattgtt ttaatcatta ggaaacactg 1860attaactgca agtcttcatg attctgtcat attaagaaac acctgtaggt ttgcttcaaa 1920taaaggcata tataccaagg acttacagac aaaattaaga atgtcaattt aagttaataa 1980aaatctccca atatgaaaaa aaaaaaa 2007 37 2680 DNA Homo sapiens 37cggaccgtgc aatggcccag cgtaagaatg ccaagagcag cggcaacagc agcagcagcg 60gctccggcag cggtagcacg agtgcgggca gcagcagccc cggggcccgg agagagacaa 120agcatggagg acacaagaat gggaggaaag gcggactctc aggaacttca ttcttcacgt 180ggtttatggt gattgcattg ctgggcgtct ggacatctgt agctgtcgtt tggtttgatc 240ttgttgacta tgaggaagtt ctaggaaaac taggaatcta tgatgctgat ggtgatggag 300attttgatgt ggatgatgcc aaagttttat taggacttaa agagagatct acttcagagc 360cagcagtccc gccagaagag gctgagccac acactgagcc cgaggagcag gttcctgtgg 420aggcagaacc ccagaatatc gaagatgaag caaaagaaca aattcagtcc cttctccatg 480aaatggtaca cgcagaacat gttgagggag aagacttgca acaagaagat ggacccacag 540gagaaccaca acaagaggat gatgagtttc ttatggcgac tgatgtagat gatagatttg 600agaccctgga acctgaagta tctcatgaag aaaccgagca tagttaccac gtggaagaga 660cagtttcaca agactgtaat caggatatgg aagagatgat gtctgagcag gaaaatccag 720attccagtga accagtagta gaagatgaaa gattgcacca tgatacagat gatgtaacat 780accaagtcta tgaggaacaa gcagtatatg aacctctaga aaatgaaggg atagaaatca 840cagaagtaac tgctccccct gaggataatc ctgtagaaga ttcacaggta attgtagaag 900aagtaagcat ttttcctgtg gaagaacagc aggaagtacc accagatact taaagcttca 960aaaagactgc ccctaccacc acaggaggac cagcctaacc atacgctcca aaagatggct 1020gtgatagatc ttgtgaagca attactgagc agatcaagat ctttgggaag gaacactaaa 1080gatgttttga atgaattata gtccactggc attttagtgt attttttttt ctttttacaa 1140acacacattt ctaaaaatgt catgttacat tcctgcatgt cccttttgat agcattagtg 1200gatccattgg atttcttttt tctttttgtg agacagcttt tagtcttacc tgaatttatg 1260tgtgtttttc cgacagtggt taataattat attggtgatg tagcagcaat tgtgttggca 1320gggttttcat atattattag taattaacac taactgttgg actgacttgt gtacactgtg 1380ttaaacatga tttaaaagct attaagagta ctttgtgtta gcactcttaa aaacgctaac 
1440agagatcatc attagctgtg aagatttgag ttgtatatac ctgcactgat attcttatca 1500aaaatttcta cattagcttt aagtgttcag attaacactt ttgaaatttt tgtagctttt 1560agctgattaa ttagaaaaat taatatttca gtgaaagttt taaattatca tttatttatt 1620tttttaaatg agaggggaaa gctgaaattc cttgttaaga cacaaggaaa aagaatggcc 1680ctactattat catgcaaaaa tgctttgttg gcacctcaga ttaatcatat aatagctata 1740gtctcttcag catttgttta aattttagaa aacctgtata aattactggt gcataactta 1800aagattattc tgcctttggc taattgagta attcccctcc agcactagag accgctcagt 1860gctcttacta gatgaactca gtaacgcctt gagctgggtt gattgaggat gtgtgaaaag 1920ctcacagagc ccgatgcctg ctgctatttc acggcaatga gcctttttct ttctacactg 1980aagattttct tcttatttaa tgtggtttat tttgggctca gaaataattg ctctgttgaa 2040aataatcctt tgtcagaaaa gaaggtagct accacatcat tttgaaagga ccatgagcaa 2100ctataagcaa agccataaga agtggtttga tcgatatatt aggggtagct cttgattttg 2160ttaacattaa gataaggtga ctttttcccc ctgcttttag gattaaaatc aaagatactt 2220ctatattttt atcactatag atcatagtta ttatacaatg tagtgagtcc tgcatgggta 2280ctcgatgtgt aatgaaacct gaaataataa gataataaga aaagcaataa ttttctaaag 2340ctgtgctgtc ggtgatacag agacgatact caaattataa taaaactctt cattttgtga 2400attatagaag ctacttttta taaagccata tttttttagg gaaactaagg agtgacatag 2460aactgatgaa tgagcaaaag taagttttgc tggatttttg tagaactctg gacgttgagg 2520attcattatg ctgtggttaa ctttaaatat ttttgaattc caaatatctg aattaatgag 2580ccttgtgttt acaaatatgt gccattgtgc aacatcggtg gattttctaa aaataatgta 2640aatgtcttct attaaatgtt gagtgcaata aaatccagaa 2680 38 3164 DNA Homosapiens 38 cggcctcaga aagccgagtg aggagttggc cgtagtgaga gggaccgatcccttggggcc 60 gccggcggcg aggccgagcc gctcctccca atggcgaaga agacgtacgacctgcttttc 120 aagctgctcc tgatcgggga ttccggagtg gggaagacct gcgtcctttttcgtttttcg 180 gatgatgcct tcaatactac ctttatttcc accataggaa tagacttcaagatcaaaaca 240 gttgaattac aaggaaagaa gatcaagcta cagatatggg atacagcaggccaggagcga 300 tttcacacca tcacaacctc ctactacaga ggcgcaatgg gtatcatgctagtatatgac 360 atcaccaatg gtaaaagttt tgaaaacatc agcaaatggc ttagaaacatagatgagcat 420 gccaatgaag atgtggaaag aatgttacta 
ggaaacaagt gtgatatggacgacaaaaga 480 gttgtaccta aaggaaaagg agaacagatt gcaagggagc atggtattaggttttttgag 540 actagtgcaa aagcaaatat aaacatcgaa aaggcgttcc tcacgttagctgaagatatc 600 cttcgaaaga cccctgtaaa agagcccaac agtgaaaatg tagatatcagcagtggagga 660 ggcgtgacag gctggaagag caaatgctgc tgagcattct cctgttccatcagttgccat 720 ccactacccc gttttctctt cttgctgcaa aataaaccac tctgtccatttttaactcta 780 aacagatatt tttgtttctc atcttaacta tccaagccac ctattttatttgttctttca 840 tctgtgactg cttgctgact ttatcataat tttcttcaaa caaaaaaatgtatagaaaaa 900 tcatgtctgt gacttcattt ttaaatgtac ttgctcagct caactgcatttcagttgtat 960 tatagtccag ttcttatcaa cattaaaacc tatagcaatc atttcaaatctattctgcaa 1020 attgtataag aataaagtta gaattaacaa ttttattttg tacaacagtggaattttctg 1080 tcatggataa tgtgcttgag tccctataat ctatagacat gtgatagcaaaagaaacaaa 1140 caaaagccag gaaaacactc attttcgcct tgaatatgta aatgggattaattttgtcct 1200 gtgccttatg tggaaaggga cttctttggg ttttcctttt ttgttctggtggaagcatgt 1260 gcagggagac catatcatcc aaaccataaa cccattaaaa tgtttgtggtttgcttggct 1320 gtaattttca aagtagttaa ttgaggacaa agggtaatgc agaagtgatagctttggttt 1380 gctgagtctt gttttaagtg gccttgatat ttaaaactat tcctgccaccatttcttctc 1440 cttggccact tcttccttgc gtctccctgc atgctgcttt atttgcttctccctccccaa 1500 ccacctcatg gtatatttaa gagtgaaagg gacaaactag taggtttgtcaagtttaata 1560 taaagcactg atgtaacttg ctaggtaaac ggaaagataa gttctaactgcctactatcc 1620 aatgtcagtt aattggtgtc ttcccccctc atttgctctc ttccctaaaatgtgtcccag 1680 atgccttcat ttgctgtttt acttctatgt tctgcttttc ctcctctctttgttcccttc 1740 ctgtctatcc attgagttta tgaaatggaa gagttaactg catgcactagtgtttggagg 1800 gtgttgtggt ttgtctttct aattaggtgt atagcctatt cactttcctaggaataaatc 1860 tcttaaccta aatttgagta gtctgcattt tggcaactcc tctaagcagcttggtagcct 1920 aagtacaggt tgttttttta aaaaaggaaa agcaggaagg aggagtgaattttattaaca 1980 tgtttgccaa atgtattgag atttggcctc tgaagaacac tttttcagtgttaagtttct 2040 ttaccttaag attccgaaat actttagaat attattaatt ttaagtcctgtctttacatc 2100 cttttggaaa acttgtatta ccatgagttt ggaaaaagga caacgaaaggcttttcatgt 2160 aaagataaga 
tctttagcta tctctaaccc tgtccttttt tcactgcattttttctagtt 2220 ttgcttcatt gcttatcatt aggatagggt aagtgaagtt tgctatgctgctagcatcct 2280 aagatgatac ctttgttgaa agaattgtga atagcatgat tcatttctagcagaggctga 2340 gtttaggaca gcagcttcca ttgagaagtc tttctgtgtc gtgaatagcattttaatgac 2400 ctcttggctc acataagcaa acaacatagg gacgtatctg ctatgaaaatccacaaattt 2460 ttcagatagt gccctaaaaa caattttata tgcctcactg gttgttattcttaggttatt 2520 cccacacttg actttatcat tgtttactac tagtaaaaag cagcattgccaaataatccc 2580 taattttcca ctaaaaatat aatgaaatga tgttaagctt tttgaaaagtttaggttaaa 2640 cctactgttg ttagattaat gtatttgttg cttcccttta tctggaatgtggcattagct 2700 tttttatttt aaccctcttt aattcttatt caattccatg acttaaggttggagagctaa 2760 acactgggat ttttggataa cagactgaca gttttgcata attataatcggcattgtaca 2820 tagaaaggat atggctacct tttgttaaat ctgcactttc taaatatcaaaaaagggaaa 2880 tgaagtataa atcaattttt gtataatctg tttgaaacat gagttttatttgcttaatat 2940 tagctttgcc ccttttctgt aagtctcttg ggatcctgtg tagaagctgttctcattaaa 3000 caccaaacag ttaagtccat tctctggtac tagctacaaa ttcggtttcatattctactt 3060 aacaatttaa ataaactgaa atatttctag atggtctact tctgttcatataaaaacaaa 3120 acttgatttc caaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaa 3164 392376 DNA Homo sapiens 39 gtcgcctccg cctgatcccc ggcctgtcgg ccgaccccacctcgccaacc gaggcggacc 60 gcggagtgtg cgaacgaccc caccgctgct ttctcctcccccagatcacg caccccagct 120 ccggaagatg gggaactgcc tcaaatcccc cacctcggatgacatctccc tgcttcacga 180 gtctcagtcc gaccgggcta gctttggcga ggggacggagccggatcagg agccgccgcc 240 gccatatcag gaacaagttc cagttccagt ctaccacccaacacctagcc agactcggct 300 agcaactcag ctgactgaag aggaacaaat taggatagctcaaagaatag gtcttataca 360 acatctgcct aaaggagttt atgaccctgg aagagatggatcagaaaaaa agatccggga 420 gtgtgtgatc tgtatgatgg actttgttta tggggacccaattcgatttc tgccgtgcat 480 gcacatctat cacctggact gtatagatga ctggttgatgagatccttca cgtgcccctc 540 ctgcatggag ccagttgatg cagcactgct ttcatcctatgagactaatt gagccagggt 600 ctcttatctg acttcaagtg aaccaccatt ttggtggttttgatcttttg tcactgagcc 660 caaagagcca gggattagga attaagatcg 
tgcacaaaagtttccttaaa attcctggat 720 ggctgcagat gttgggggaa aaagtacgtg atattttagaaacttagtgg gaaaagtagg 780 atggtatttt tatgtaaagc cttgacccaa tgtttaaaaatataattgta tttagatctt 840 gttattgctc cagtacatag gaattgtgta aagtgttaacagcagctgta ttgtttaaat 900 tgtgtgtatt gaagattagg aaaaagatag tagttatttttcctaaatga aataactttc 960 ttctcttccc cttccccacc cgaattcttt tctgaagttgctggcatttg ggtcaaggtt 1020 ttattaaaag ctacatttta taacactggc acacacaaaaaagtagtttt aagcttgttt 1080 gcacagttct ttttttccat tggaaatgga attcattgccttaggtcttt ttaaatagtg 1140 tattattatc gttggggctg gctctatgct tgaaaaccagtttatttata acctgttata 1200 agtgctatat tctgtttgca gttaggaaat gcagaattcaaagtgatctc ctagcttgta 1260 agcaaactga gatgcactat cccttttcta taaaaaataagttaatgtgt caagaaacca 1320 actctattaa ggtggggttt aatattaccc tttcctatgtgttttatcta attattttgg 1380 ttgttaatat ggtgataatg gaaagtcaag ttaaattttaaatattaaga attctgattt 1440 attgagattg aattatgcca ccacgtttat gtaaaaatgaaggtggcacc gtggtgagac 1500 ctaatgagaa atagttactc agttgtaaaa attttgatttattctctttc ttctgacctc 1560 cttgcctctt gtcttgaacc atagcaaaag gatactgcatctctcattac tgtagtgctg 1620 aggttattga agttatacaa aacacatctc agtctctgtttcttggaaag gtatctatta 1680 catcctgcta gctgactgac aaaactaagc agggagaataaagataattg tattttatgt 1740 tttgcacaca aacgcagaat ttgtataacc atatgacttcatagttgtga tctcaaaaaa 1800 agaaggaatt tctcctttgt ttcttgcagt taatgtaagaatactttaaa tctctaagct 1860 tctgaagtgt tagaggtaga gatggtctag taaagatgtagtagtaatgt tttatccatt 1920 tagcatgtgt ttattttttc atatgtactc aaaggtgacttattggttca cctcagtgat 1980 attacagcta aaaaaatcat tcattagcaa aaggaaaagtggtctcaacc taacatcaga 2040 agtgtttctt attattattt tatattgagt tgaatattgaactctaacag ttttctacat 2100 acaaaacaca gtgtcatgaa ggttattcat aattgcattatagaggaatg tagtatgtca 2160 taagtacttt gtaaagattt gacattcaac tgtagtatccatatgttgct taaatttcct 2220 tatgagcccc atgatggaaa gacttaaaga tgaatttgagaaaaattgaa agaaattaga 2280 ttatcaggtt ctgttaaatt gttacatgta tcttgcttaaatttctgttt attaatttat 2340 atccacccaa gtacataaag caaatttgga ggaaac 237640 3198 DNA Homo sapiens 40 
aacctgaata tccaggtgga ggacattcgg attcgagccatcctctcaac ctaccgcaag 60 cgcaccccag tgatggaggg ctacgtggag gtgaaggagggcaagacctg gaagcagatc 120 tgtgacaagc actggacggc caagaattcc cgcgtggtctgcggcatgtt tggcttccct 180 ggggagagga catacaatac caaagtgtac aaaatgtttgcctcacggag gaagcagcgc 240 tactggccat tctccatgga ctgcaccggc acagaggcccacatctccag ctgcaagctg 300 ggcccccagg tgtcactgga ccccatgaag aatgtcacctgcgagaatgg gcagccggcc 360 gtggtgagtt gtgtgcctgg gcaggtcttc agccctgacggaccctcgag attccggaaa 420 gcatacaagc cagagcaacc cctggtgcga ctgagaggcggtgcctacat cggggagggc 480 cgcgtggagg tgctcaaaaa tggagagtgg gggaccgtctgcgacgacaa gtgggacctg 540 gtgtcggcca gtgtggtctg cagagagctg ggctttgggagtgccaaaga ggcagtcact 600 ggctcccgac tggggcaagg gatcggaccc atccacctcaacgagatcca gtgcacaggc 660 aatgagaagt ccattataga ctgcaagttc aatgccgagtctcagggctg caaccacgag 720 gaggatgctg gtgtgagatg caacacccct gccatgggcttgcagaagaa gctgcgcctg 780 aacggcggcc gcaatcccta cgagggccga gtggaggtgctggtggagag aaacgggtcc 840 cttgtgtggg ggatggtgtg tggccaaaac tggggcatcgtggaggccat ggtggtctgc 900 cgccagctgg gcctgggatt cgccagcaac gccttccaggagacctggta ttggcacgga 960 gatgtcaaca gcaacaaagt ggtcatgagt ggagtgaagtgctcgggaac ggagctgtcc 1020 ctggcgcact gccgccacga cggggaggac gtggcctgcccccagggcgg agtgcagtac 1080 ggggccggag ttgcctgctc agaaaccgcc cctgacctggtcctcaatgc ggagatggtg 1140 cagcagacca cctacctgga ggaccggccc atgttcatgctgcagtgtgc catggaggag 1200 aactgcctct cggcctcagc cgcgcagacc gaccccaccacgggctaccg ccggctcctg 1260 cgcttctcct cccagatcca caacaatggc cagtccgacttccggcccaa gaacggccgc 1320 cacgcgtgga tctggcacga ctgtcacagg cactaccacagcatggaggt gttcacccac 1380 tatgacctgc tgaacctcaa tggcaccaag gtggcagagggccaaaaggc cagcttctgc 1440 ttggaggaca cagaatgtga aggagacatc cagaagaattacgagtgtgc caacttcggc 1500 gatcagggca tcaccatggg ctgctgggac atgtaccgccatgacatcga ctgccagtgg 1560 gttgacatca ctgacgtgcc ccctggagac tacctgttccaggttgttat taaccccaac 1620 ttcgaggttg cagaatccga ttactccaac aacatcatgaaatgcaggag ccgctatgac 1680 ggccaccgca tctggatgta caactcccac ataggtggttccttcagcga 
agagacggaa 1740 aaaaagtttg agcacttcag cgggctctta aacaaccagctgtccccgcc agtaaagaag 1800 cctgcgtggt caactcctgt cttcaggcca caccacatcttccatgggac ttctccccaa 1860 caactgagtc tgaacgaatg ccacgtgccc tcacccagcccggcccccac cctgtccaga 1920 cccctacagc tgtgtctaag ctcaggagga aagggaccctcccatcattc atggggggct 1980 gctacctgac ccttggggcc tgagaaggcc ttgcgggggtggggtttgtc cacagagctg 2040 ctggagcagc accaagagcc agtcttgacc gggatgaggcccacagacag gttgtcatca 2100 gcttgtccca ttcaagccac cgagctcacc acagacacagtggagccgcg ctcttctcca 2160 gtgacacgtg gacaaatgcg ggctcatcag cccccccagagagggtcagg ccgaacccca 2220 tttctcctcc tcttacctca ttttcagcaa acttgaatatctagacctct cttccaatga 2280 aaccctccag tctattatag tcacatagat aatggtgccacgtgttttct gatttggtga 2340 gctcagactt ggtgcttccc tatccacagc ccccaccccttgtttttcaa gatactatta 2400 ttatattttc acagactttt gaagcacaaa tttattggcatttaatattg gacatctggg 2460 cccttggaag tacaaatcta aggaaaaacc aacccactgtgtaagtgact catcttcctg 2520 ttgttccaat tctgtgggtt tttgattcaa cggtgctataaccagggtcc tgggtgacag 2580 ggagatacat gagcaccatg tgtcatcaca gacacttacacatacttgaa acttggaata 2640 aaagaaagat ttatgaaacg tgtctgtgtt tcctttgacccacagcacct gggccctgag 2700 cagcaggctt cctatgttca gtggccagaa gcagagcttcaggtacattc gtggttttct 2760 ccggtggaca tgggtcctca gatcccctcc agcccagtgtggccaccagg gcacctcctt 2820 caatagactc caaaaggggc agctcctacc atctgggagaagcaatctaa ggagatcaca 2880 aaaagtaacg gaacaggagt cataatcttt cttgaactcctgtggttttt actgaaactt 2940 gtcagaaggc ataggagttg tgcgagggct ggatgggaagtctagattta aacagccacc 3000 aggcagctta tcaaagcaag agggcatccg ttcacaggacaggggctccc agcaattccc 3060 agtggcagtg gggggtggct ggcccaagcc ccaagtcacccagacacagg ggacttcccc 3120 ttgtgtcaac agcatgctag ggcccagcaa actagagggtaggtaggacc accttggcac 3180 caactccact caaaccac 3198 41 5539 DNA Homosapiens 41 ggagggaggc cgggcaggcg gctgagcggc gcggctctca acgtgacggggaagtggttc 60 gggcggccgc ggcttactac cccagggcga acggacggac gacggaggcgggagccggta 120 gccgagccgg gcgacctaga gaacgagcgg gtcaggctca gcgtcggccactctgtcggt 180 ccgctgaatg aagtgcccgc ccctctaagc ccggagcccg 
gcgctttccccgcaagatgg 240 acggtttcgc cggcagtctc gatgatagta tttctgctgc aagtacttctgatgttcaag 300 atcgcctgtc agctcttgag tcacgagttc agcaacaaga agatgaaatcactgtgctaa 360 aggcggcttt ggctgatgtt ttgaggcgtc ttgcaatctc tgaagatcatgtggcctcag 420 tgaaaaaatc agtctcaagt aaaggccaac caagccctcg agcagttattcccatgtcct 480 gtataaccaa tggaagtggt gcaaacagaa aaccaagtca taccagtgctgtctcaattg 540 caggaaaaga aactctttca tctgctgcta aaagtggtac agaaaaaaagaaagaaaaac 600 cacaaggaca gagagaaaaa aaagaggaat ctcattctaa tgatcaaagtccacaaattc 660 gagcatcacc ttctccccag ccctcttcac aacctctcca aatacacagacaaactccag 720 aaagcaagaa tgctactccc accaaaagca taaaacgacc atcaccagctgaaaagtcac 780 ataattcttg ggaaaattca gatgatagcc gtaataaatt gtcgaaaataccttcaacac 840 ccaaattaat accaaaagtt accaaaactg cagacaagca taaagatgtcatcatcaacc 900 aagaaggaga atatattaaa atgtttatgc gcggtcggcc aattaccatgttcattcctt 960 ccgatgttga caactatgat gacatcagaa cggaactgcc tcctgagaagctcaaactgg 1020 agtgggcata tggttatcga ggaaaggact gtagagctaa tgtttaccttcttccgaccg 1080 gggaaatagt ttatttcatt gcatcagtag tagtactatt taattatgaggagagaactc 1140 agcgacacta cctgggccat acagactgtg tgaaatgcct tgctatacatcctgacaaaa 1200 ttaggattgc aactggacag atagctggcg tggataaaga tggaaggcctctacaacccc 1260 acgtcagagt gtgggattct gttactctat ccacactgca gattattggacttggcactt 1320 ttgagcgtgg agtaggatgc ctggattttt caaaagcaga ttcaggtgttcatttatgtg 1380 ttattgatga ctccaatgag catatgctta ctgtatggga ctggcagaagaaagcaaaag 1440 gagcagaaat aaagacaaca aatgaagttg ttttggctgt ggagtttcacccaacagatg 1500 caaataccat aattacatgc ggtaaatctc atattttctt ctggacctggagcggcaatt 1560 cactaacaag aaaacaggga atttttggga aatatgaaaa gccaaaatttgtgcagtgtt 1620 tagcattctt ggggaatgga gatgttctta ctggagactc aggtggagtcatgcttatat 1680 ggagcaaaac tactgtagag cccacacctg ggaaaggacc taaaggtgtatatcaaatca 1740 gcaaacaaat caaagctcat gatggcagtg tgttcacact ttgtcagatgagaaatggga 1800 tgttattaac tggaggaggg aaagacagaa aaataattct gtgggatcatgatctgaatc 1860 ctgaaagaga aatagaggtt cctgatcagt atggcacaat cagagctgtagcagaaggaa 1920 aggcagatca atttttagta 
ggcacatcac gaaactttat tttacgaggaacatttaatg 1980 atggcttcca aatagaagta cagggtcata cagatgagct ttggggtcttgccacacatc 2040 ccttcaaaga tttgctcttg acatgtgctc aggacaggca ggtgtgcctgtggaactcaa 2100 tggaacacag gctggaatgg accaggctgg tagatgaacc aggacactgtgcagattttc 2160 atccaagtgg cacagtggtg gccataggaa cgcactcagg caggtggtttgttctggatg 2220 cagaaaccag agatctagtt tctatccaca cagacgggaa tgaacagctctctgtgatgc 2280 gctactcaat agatggtacc ttcctggctg taggatctca tgacaactttatttacctct 2340 atgtagtctc tgaaaatgga agaaaatata gcagatatgg aaggtgcactggacattcca 2400 gctacatcac acaccttgac tggtccccag acaacaagta tataatgtctaactcgggag 2460 actatgaaat attgtactgg gacattccaa atggctgcaa actaatcaggaatcgatcgg 2520 attgtaagga cattgattgg acgacatata cctgtgtgct aggatttcaagtatttggtg 2580 tctggccaga aggatctgat gggacagata tcaatgcact ggtgcgatcccacaatagaa 2640 aggtgatagc tgttgccgat gacttttgta aagtccatct gtttcagtatccctgctcca 2700 aagcaaaggc tcccagtcac aagtacagtg cccacagcag ccatgtcaccaatgtcagtt 2760 ttactcacaa tgacagtcac ctgatatcaa ctggtggaaa agacatgagcatcattcagt 2820 ggaaacttgt ggaaaagtta tctttgcctc agaatgagac tgtagcggatactactctaa 2880 ccaaagcccc cgtctcttcc actgaaagtg tcatccaatc taatactcccacaccgcctc 2940 cttctcagcc cttaaatgag acagctgaag aggaaagtag aataagcagttctcccacac 3000 ttctggagaa cagcctggaa caaactgtgg agccaagtga agaccacagcgaggaggaga 3060 gtgaagaggg cagcggagac cttggtgagc ctctttatga agagccatgcaacgagataa 3120 gcaaggagca ggccaaagcc acccttctgg aggaccagca agacccttcgccctcgtcct 3180 aacaccctgg cttcagtgca actcttttcc ttcagctgca tgtgattttgtgataaagtt 3240 caggtaacag gatgggcagt gatggagaat cactgttgat tgagattttggtttccatgt 3300 gatttgtttt cttcaatagt cttattttca gtctctcaaa tacagccaacttaaagtttt 3360 agtttggtgt ttattgaaaa ttaaccaaac ttaatactag gagaagactgaatcattaat 3420 gatgtctcac aaattactgt gtacctaagt ggtgtgatgt aaatactggaaacaaaaaca 3480 gcagttgcat tgattttgaa aacaaacccc cttgttatct gaacatgttttcttcaggaa 3540 caaccagagg tatcacaaac actgttactc atctactggc tcagactgtactactttttt 3600 tttttttttt cctgaaaaag aaaccagaaa aaaatgtact 
cttactgagataccctctca 3660 ccccaaatgt gtaatggaaa atttttaatt aagaaaaact tcagttttgccaagtgcaat 3720 ggtgttgcct tctttaaaaa atgccgtttt cttacactac cagtggatgtccagacatgc 3780 tcttagtcta ctagagaggt gctgcctttt ctaagtcata atgaggaacagtcccttaat 3840 ttcttgtgtg caactctgtt ttatcctaga actaagagag cattggtttgttaaagagct 3900 ttcaatgtat attaaaacct tcaatactca gaaatgatgg attcctccaaggagtccttt 3960 actagcctaa acattctcaa atgtttgaga ttcaagtgaa tggaaggaaaaccacatgcc 4020 tttaaaacta aactgtaata attacctggc taatttcagc taagccttcatcataatttg 4080 ttccctcagt aataggagaa atataaatac agtaagttta gattattgaattggtgcttg 4140 aaatttattg gttttgttgt aattttatac agattatatg agggataagatactcatcaa 4200 attgcaaatt ctttttttta cagaagtgtg ggtaacagtc acagcagttttttttaccaa 4260 cagcatactt aacagacttg ctgtgtagca gtttttttct ggtggagttgctgtaagtct 4320 tgtaagtcta atgtggctat cctactcttt tgggcaatgc atgtattatgcattggaaag 4380 gtattttttt taagttctgt tggctagcta tggttttcag tacatttcctactttaagag 4440 taattactga caaatatgta tttcctatat gtttatactt tgattataaaaaagtatttt 4500 gttttgattt tttaacttgc tgcattgttt tgatactttc tatttttttggtcaaatcat 4560 gtttagaaac tttggatgag ttaagaagtc ttaagtatgc aggcgtttacgtgattgtgc 4620 cattccaaag tgcatcagaa ctgtcattcc cttctaatat cttctcaggagtaatacaaa 4680 tcaggtattt catcatcatt tggtaatatg aaaactccag tgaactcccaaggacattta 4740 caacatttat attcacacgc tgtatggaag ggtgtgggtg tgtgtgaaggggcgagtgga 4800 gacactgtgt gtatctctag ataagaagat atgcaccacg ttgaaaatactcagtgtaga 4860 tctctatgtg tataggtatc tgtatatctt tccttttgtt tacaactgttaaaaaacctc 4920 aaaatagttc tcttcaaaag aagagagatt ccaagcaacc catctttcttcagtatgtat 4980 gttctgtaca tacttatcgg agcgcgccag taagtatcag gcatatatatctgtctgtta 5040 gcaatgatta ttacatcatc agatcagcat gtgctatact ccctgcaagaaatatactga 5100 catgaacagg cagttcttgg agaagaaaga gcatttcttt aagtacctggggaatacagc 5160 tctcagtgat cagcagggag tttatttgag gacatcagtc acctttggggttgccatgta 5220 caatgagatt tataatcatg atactcttcg gtggtagttt caaaagacactactaatacg 5280 caggaagcgt tccagctatt taatgctggc aactactgtt taatggtcagttaaatctgt 5340 gataatggtt 
ggaagtgggt ggggttatga aattgtagat gtttttagaaaaacttgtga 5400 atgaaaatga atccaagtgt ttcatgtgaa gatgttgagc cattgctatcatgcattcct 5460 gtctcatggc agaaaatttt gaagattaaa aaataaaata atcaaaatgtttcctctttc 5520 taaaaaaaaa aaaaaaaaa 5539 42 3561 DNA Homo sapiens 42gcagtggaac gcgctgggcc gcgggcagcg tcgcctcacg cggagcagag ctgagctgaa 60gcgggacccg gagcccgagc agccgccgcc atggcaatca aatttctgga agtcatcaag 120cccttctgtg tcatcctgcc ggaaattcag aagccagaga ggaagattca gtttaaggag 180aaagtgctgt ggaccgctat caccctcttt atcttcttag tgtgctgcca gattcccctg 240tttgggatca tgtcttcaga ttcagctgac cctttctatt ggatgagagt gattctagcc 300tctaacagag gcacattgat ggagttaggg atctctccta ttgtcacgtc tggccttata 360atgcaactct tggctggcgc caagataatt gaagttggtg acaccccaaa agaccgagct 420ctcttcaacg gagcccaaaa gttatttggc atgatcatta ctatcggcca gtctatcgtg 480tatgtgatga cctggatgta tggggaccct tctgaaatgg gtgctggaat ttgcctgcta 540atcaccattc agctctttgt tgctggctta attgtcctac ttttggatga actcctgcaa 600aaaggatatg gccttggctc tggtatttct ctcttcattg caactaacat ctgtgaaacc 660atcgtatgga aggcattcag ccccactact gtcaacactg gccgaggaat ggaatttgaa 720ggtgctatca tcgcactttt ccatctgctg gccacacgca cagacaaggt ccgagccctt 780cgggaggcgt tctaccgcca gaatcttccc aacctcatga atctcatcgc caccatcttt 840gtctttgcag tggtcatcta tttccagggc ttccgagtgg acctgccaat caagtcggcc 900cgctaccgtg gccagtacaa cagctatccc atcaagctct tctatacgtc caacatcccc 960atcatcctgc agtctgccct ggtgtccaac ctgtatgtca tctcccagat gctgtctgtt 1020cgatttagtg gcaacttttt agtaaattta ctaggacagt gggccgatgt cagtggggga 1080ggacccgcac gttcttaccc agttggaggc ctttgttact atctttctcc tcctgagtcc 1140atgggcgcca tctttgagga tcccgtccat gcagttgtat acatagtgtt catgctgggc 1200tcctgtgcat tcttctccaa aacgtggatt gaggtctcag gttcctctgc caaagatgtt 1260gcaaagcagc tgaaggagca gcagatggtg atgagaggcc accgagagac ctccatggtc 1320catgaactca accggtacat ccccacagcc gcggcctttg gtgggctgtg catcggggcc 1380ctctcggtcc tggctgactt cctaggcgcc attgggtctg gaaccgggat cctgctcgca 1440gtcacaatca tctaccagta ctttgagatc ttcgttaagg agcaaagcga ggttggcagc 1500atgggggccc 
tgctcttctg agcccgtctc ccggacaggt tgaggaagct gctccagaag 1560cgcctcggaa ggggagctct catcatggcg cgtgctgctg cggcatatgg acttttaata 1620atgtttttga atttcgtatt ctttcattcc actgtgtaaa gtgctagaca ttttccaatt 1680taaaattttg ctttttatcc tggcactggc aaaaagaact gtgaaagtga aattttattc 1740agccgactgc cagagaagtg ggaatggtat aggattgtcc ccaagtgtcc atgtaacttt 1800tgttttaacc tttgcacctt ctcagtgctg tatgcggctg cagccgtctc acctgtttcc 1860ccacaaaggg aatttctcac tctggttgga agcacaaaca ctgaaatgtc tacgtttcat 1920tttggcagta gggtgtgaag ctgggagcag atcatgtatt tcccggagac gtgggacctt 1980gctggcatgt ctccttcaca atcaggcgtg ggaatatctg gcttaggact gtttctctct 2040aagacaccat tgttttccct tattttaaaa gtgatttttt taaggacaga acttcttcca 2100aaagagaggg atggctttcc cagaagacac tcctggccat ctgtggattt gtctgtgcac 2160ctattggctc ttctagctga ctcttctggt tgggcttaga gtctgcctgt ttctgctagc 2220tccgtgttta gtccacttgg gtcatcagct ctgccaagct gagcctggcc aagctaggtg 2280gacagaccct tgcagtgatg tccgtttgtc cagattctgc cagtcatcac tggacacgtc 2340tcctcgcagc tgccctagca aggggagaca ttgtggtagc tatcagacat ggacagaaac 2400tgacttagtg ctcacaagcc cctacacctt ctgggctgaa gatcacccag ctgtgttcag 2460aattttctta ctgtgcttag gactgcacgc aagtgagcag acaccaccga cttcctttct 2520gcgtcaccag tgtcgtcagc agagagagga cagcacaggc tcaaggttgg tagtgaagtc 2580aggttcgggg tgcatgggct gtggtggtgt tgatcagttg ctccagtgtt tgaaataaga 2640agactcatgt ttatgtctgg aataagttct gtttgtgctg acaggtggcc taggtcctgg 2700agatgagcac cctctctctg gcctttaggg agtcccctct taggacaggc actgcccagc 2760agcaagggca gcagagttgg gtgctaagat cctgaggagc tcgaggtttc gagctggctt 2820tagacattgg tgggaccaag gatgttttgc aggatgccct gatcctaaga agggggcctg 2880ggggtgcgtg cagcctgtcg gggagacccc actctgacag tgggcacacg gcagcctgca 2940aagcacaggg ccaccgccac agcccggcag aggggcacac tctggagacc ttgctggcag 3000tgctagccag gaaacagagt gaccaaggga caagaaggga cttgcctaaa gccacccagc 3060aactcagcag cagaaccaag atgggcccca ggctcctcca tatggcccag ggcttaccac 3120cctatcacac gtggccttgt ctagacccag tcctgagcag gggagaggct cttgagacct 3180gatgccctcc tacccacatg gttctcccac tgccctgtct 
gctctgctgc tacagagggg 3240cagggcctcc cccagcccac gcttaggaat gcttgcctct ggcaggcagg cagctgtacc 3300caagctggtg ggcagggggc tggaaggcac caggcctcag gaggagcccc atagtcccgc 3360ctgcagcctg taaccatcgg ctgggccctg caaggcccac actcacgccc tgtgggtgat 3420ggtcacggtg ggtgggtggg ggctgacccc agcttccagg ggactgtcac tgtggacgcc 3480aaaatggcat aactgagata aggtgaataa gtgacaaata aagccagttt tttacaaggt 3540aaaaaaaaaa aaaaaaaaaa a 3561 43 754 DNA Homo sapiens 43 ggagtatgagatgaaacgaa tggcagagaa tgagctgagc cggtcagtaa atgagtttct 60 gtccaagctgcaagatgacc tcaaggaggc aatgaatact atgatgtgta gccgatgcca 120 aggaaagcataggaggtttg aaatggaccg ggaacctaag agtgccagat actgtgctga 180 gtgtaataggctgcatcctg ctgaggaagg agacttttgg gcagagtcaa gcatgttggg 240 cctcaagatcacctactttg cactgatgga tggaaaggtg tatgacatca cagagtgggc 300 tggatgccagcgtgtaggta tctccccaga tacccacaga gtcccctatc acatctcatt 360 tggttctcggattccaggca ccagagggcg gcagagagcc accccagatg cccctcctgc 420 tgatcttcaggatttcttga gtcggatctt tcaagtaccc ccagggcaga tgccaatggg 480 aacttctttgcagctcctca gcctgcccct ggagccgctg cagcctctaa gcccaacagc 540 acagtacccaagggagaagc caaacctaag cggcggaaga aagtgaggag gcccttccaa 600 cgttgatgccccttctcttt cctcaaatca atgtcaggga gtcaaaaggg ctgtagcaca 660 ggatggagtttgatttatcc ctcctccccc aacacctagg aactgaatct ttttcttttt 720 attttttgagatggagtctt gctctgttgc ccag 754 44 1292 DNA Homo sapiens 44 tgagtttacgcagacgcaga aaacgcaggc aaacctgagg tcctcagaat ggcgggcaca 60 ggtttggtggctggagaggt tgtggtggat gcgctgccgt attttgatca aggttatgaa 120 gcccctggtgtgcgggaagc ggctgcagcg ctggtggagg aggaaactcg cagataccga 180 cctactaagaactacctgag ctacctgaca gccccggatt attctgcctt tgaaactgac 240 ataatgagaaatgaatttga aagactggct gctcgacaac caattgaatt gctcagtatg 300 aaacgatatgagcttccagc cccctcctct ggtcaaaaaa atgacattac tgcatggcaa 360 gaatgtgtaaacaattctat ggcccagtta gagcatcaag cagttagaat tgagaatctg 420 gaactaatgtcacagcatgg atgtaatgcc tggaaagtat acaatgaaaa tctagttcat 480 atgattgaacacgcacagaa ggaacttcag aagttaagaa aacatattca agatttaaac 540 tggcagagaaagaacatgca actcacagct ggatctaaat 
tgagagaaat ggagtcaaat 600 tgggtatccctggtcagtaa gaattatgag attgaacgga ctattgttca gctagaaaat 660 gaaatctatcaaattaagca gcaacatgga gaggcaaaca aagaaaacat ccggcaagac 720 ttctgaaaagacaatttagc aggtagaaga aaagttgggc tttcacaaaa ggcatctgaa 780 cttttaatgaactgtgaagg acaacagcat cttcccaaaa ccattgatgt ttaagtgttt 840 agaaatcatagaaggtgtag gctgctgtgg taattctatt tgtatatctc aacagaatta 900 aaatgtctagcttggtggta tttttatagc cataaaagaa aatctttagg ctttcaaaat 960 aaggatgactttagaataat attgtgtcat agaattaatt ttcagccatg tggaccatat 1020 tttgtatccaaggatcctta tttaaagctt tcaacatgta caggaagttg gaaatttttg 1080 gtttatgactttgtctaata aagagatagt tctaaacaca ttcttgatca ccaaacaact 1140 tcagaaagacagtgactgta cagttatcat tatcatcatc atcgtcatca tcataggtaa 1200 cagttatatcaagcttaata tgtgctgaac attgttctga acactttagg tagatgaact 1260 agatgtatgtgaataaaaat tatttggtcc tc 1292 45 2981 DNA Homo sapiens 45 ccccaaatctgcagatgtga atcccaagta ccagtgtgat ctggtgtcta aaaacgggaa 60 tgatgtatatcgctatccca gtccacttca tgctgtggct gtgcagagcc caatgtttct 120 cctttgtctgacgggcaacc ctctgaggga agaggacagg cttggaaacc atgccagtga 180 catttgcggtggatctgagc tagatgccgt caaaacagac agttccttac cgtccccaag 240 cagtctgtggtctgcttccc atccttcatc cagcaagaaa atggatggct acattctgag 300 cctggtccagaaaaaaacac accctgtaag gaccaacaaa ccaagaacca gcgtgaacgc 360 tgaccccacgaaagggcttc tgaggaacgg gagcgtttgt gtcagagccc cgggcggtgt 420 ctcacagggcaacagtgtga accttaagaa ttcgaaacag gcgtgtctgc cctctggcgg 480 gataccttctctgaacaatg ggacattctc cccaccgaag cagtggtcga aagaatcaaa 540 ggccgaacaagccgaaagca agagggtgcc cctgccagag ggctgcccct caggcgctgc 600 ctccgaccttcagagtaagc acctgccaaa aacggccaag ccagcctcgc aagaacatgt 660 tcggtgttccgccattggga caggggagtc ccctaaggaa agcgctcagc tctcaggggc 720 ctctccaaaagagagtccta gcagaggccc tgccccgccg caggagaaca aagttgtaca 780 gcccctgaaaaagatgtcac agaaaaacag cctgcagggc gtccccccgg ccactcctcc 840 cctgctgtctacagctttcc ccgtggaaga gaggcctgcc ttggatttca agagcgaggg 900 ctcttcccaaagcctggagg aagcgcacct ggtcaaggcc cagtttatcc cggggcagca 960 gcccagtgtcaggctccacc ggggccacag 
gaacatgggc gtcgtgaaga actccagcct 1020 gaagcaccgcggcccagccc tccaggggct ggagaacggc ttgcccaccg tcagggagaa 1080 aacgcgggccgggagcaaga agtgtcgctt cccagatgac ttggatacaa ataagaaact 1140 caagaaagcctcctccaagg ggaggaagag tgggggcggg cccgaggctg gtgttcccgg 1200 caggcccgcgggcgggggcc acagggcggg gagcagggcg catggccacg gacgggaggc 1260 ggtggtggccaaacctaagc acaagcgaac tgactaccgg cggtggaagt cctcggccga 1320 gatttcctacgaagaggccc tgaggagggc ccggcgcggt cgccgggaga atgtggggct 1380 gtaccccgcgcctgtgcctc tgccctacgc cagcccctac gcctacgtgg ctagcgactc 1440 cgagtactcggccgagtgcg agtccctgtt ccactccacc gtggtggaca ccagtgagga 1500 cgagcagagcaattacacca ccaactgctt cggggacagc gagtcgagtg tgagcgaggg 1560 cgagttcgtgggggagagca caaccaccag cgactctgaa gaaagcgggg gcttaatttg 1620 gtcccagtttgtccagactc tgcccattca aacggtaacg gccccagacc ttcacaacca 1680 ccccgcaaaaacctttgtca aaattaaggc ctcacataac ctcaagaaga agatcctccg 1740 ctttcggtctggctctttga aactgatgac gacggtttga gtgacatcat tggtgtagaa 1800 agtttgtgtgtttttttttc ttctccctag ttgccaaaat taaaaaggtg gtgttttcat 1860 ttttgtataatactttaatg gaatgctttt taaaaaaata taaaaccaag gtaaattatt 1920 gtttcatcttcacgtatgga tgctagtgcc tttaatggaa ggtaaagaat gttttgctag 1980 ttagaagtacatattgaggt tttaatggtg gtgatagtga gttttgtggc accagctgtt 2040 ttttattttaaactttctga gcatccggca aggtacaggt tttgatgttc aagttttatt 2100 gggataagatcttttgatcc caaggtcagg tggatggaat ttttggattt atatttgttc 2160 cttgagtcttcagggcagtg tctccatgag ggttttcctg ttgaggggca ccacatacaa 2220 tagtgtgaagtaggtatgag gggcagtcat tgtattctat agttttttta tgtagtctac 2280 atttctcagatgtatcccca ttcggtttta ttctcagaac tgttactaga ctcatgactt 2340 ggaggccaaaccttaaatcc agagatagca gcctcgatag ggaccttaaa aggattcaca 2400 aaaacttttgccacacttgg tgcctaggcc ctgttcctaa taaccccttc tagggccgtt 2460 tatccaacatttagatgcct tcttttccct ccctaatttg tagccagtcc aacctttcat 2520 tccttggaggatttagtttt gggataaaat tttggtcctt gggcacagag acattcacta 2580 ttaatgaagtaacccttggg catgactcca atcccagaat tgctcactga gcgctatgcc 2640 accgaagcgttgacctgaac atattagtgc aatccagtcc agattggacc tttgatccta 2700 
tgtggaagggctgtttttta agaaaaaatt tttggtaaac agtattgtgt aaaattgctt 2760 tttgtataccaatatatgca tgttttgtgc atgagtagta cttgtgttga tactcctgtt 2820 gatgttaaattactatataa tataaacagt atgtgttttt atatatcatt gtgtaaattt 2880 aatataacatatgcagtaat aaaccatttg ttttactgct gttaagtttg ttatttgggt 2940 ataaaaccagatgtttacac ctgtaaaaaa aaaaaaaaaa a 2981 46 4226 DNA Homo sapiens 46ggccatgggg cgcgtcgtcg cggagctcgt ctcctcgctg ctggggttgt ggctgttgct 60gtgcagctgc ggatgccccg agggcgccga gctgcgtgct ccgccagata aaatcgcgat 120tattggagcc ggaattggtg gcacttcagc agcctattac ctgcggcaga aatttgggaa 180agatgtgaag atagacctgt ttgaaagaga agaggtcggg ggccgcctgg ctaccatgat 240ggtgcagggg caagaatacg aggcaggagg ttctgtcatc catcctttaa atctgcacat 300gaaacgtttt gtcaaagacc tgggtctctc tgctgttcag gcctctggtg gcctactggg 360gatatataat ggagagactc tggtatttga ggagagcaac tggttcataa ttaacgtgat 420taaattagtt tggcgctatg gatttcaatt cctccgtatg cacatgtggg tagaggacgt 480gttagacaag ttcatgagga tctaccgcta ccagtctcat gactatgcct tcagtagtgt 540cgaaaaatta cttcatgctc taggaggaga tgacttcctt ggaatgctta atcgaacact 600tcttgaaacc ttgcaaaagg ccggcttttc tgagaagttc ctcaatgaaa tgattgctcc 660tgttatgagg gtcaattatg gccaaagcac ggacatcaat gcctttgtgg gggcggtgtc 720actgtcctgt tctgattctg gcctttgggc agtagaaggt ggcaataaac ttgtttgctc 780agggcttctg caggcatcca aaagcaatct tatatctggc tcagtaatgt acatcgagga 840gaaaacaaag accaagtaca caggaaatcc aacaaagatg tatgaagtgg tctaccaaat 900tggaactgag actcgttcag acttctatga catcgtcttg gtggccactc cgttgaatcg 960aaaaatgtcg aatattactt ttctcaactt tgatcctcca attgaggaat tccatcaata 1020ttatcaacat atagtgacaa ctttagttaa gggggaattg aatacatcta tctttagctc 1080tagacccata gataaatttg gccttaatac agttttaacc actgataatt cagatttgtt 1140cattaacagt attgggattg tgccctctgt gagagaaaag gaagatcctg agccatcaac 1200agatggaaca tatgtttgga agatcttttc ccaagaaact cttactaaag cacaaatttt 1260aaagctcttt ctgtcctatg attatgctgt gaagaagcca tggcttgcat atcctcacta 1320taagcccccg gagaaatgcc cctctatcat tctccatgat cgactttatt acctcaatgg 1380catagagtgt gcagcaagtg ccatggagat gagtgccatt 
gcagcccaca acgctgcact 1440ccttgcctat caccgctgga acgggcacac agacatgatt gatcaggatg gcttatatga 1500gaaacttaaa actgaactat gaagtgacac actccttttt cccctcctag ttccaaatga 1560ctatcagtgg caaaaaagaa caaaatctga gcagagatga ttttgaacca gatattttgc 1620cattatcatt gtttaataaa agtaatccct gctggtcata ggaaaacaca cggttctaat 1680taagtgtgaa ggtatagcta ttgcacttat gccatctcca aaatttctta agtattcttt 1740cactatccat taggagtttt tcttaaactt gtctgataat aagaatcacc tggagttagg 1800aggtggtggt tgcagtgagc caatctcacc attgcacttc agcctgagca acacgagcaa 1860aactccgtct caaaagtaaa taaaaataat cacctggagt ttgttaaacc atatggattc 1920tcaggctcct ctcttgaaga ttctgattca gtaggtctgg gagtggcgcc ctggattttg 1980atcaaaattg tagagcattt taaggtgagt acctgaggga gaacttaaag acatcttagt 2040tggggagtag tccttttgaa ttttacagct agatataatc ttcagtcaga taaaatttat 2100gggagctggt gtcttatgcc tgactcttag taatttcata ccggtttgaa gtacgtgtgc 2160ccatgcctaa agccttgact ttcagaatgt tgtcttttga ttcttctgtc ttgatttgat 2220taggggtgaa atttagaagt cttagtaatg taacttgaag atgttaaaca aaaatctcaa 2280gtaaaatgaa aagcaaatat gggctactga attaagaaac tggcattcta gtattaaatc 2340ctcacttcag gagcttttaa aaatactgag acccccccat aaccagagat tcagattcaa 2400agactgagga taggacctta gcattgtagc tatttaaagt ttctaatgtg cacccagggt 2460tgggaatcac caatgtgggt gtgaaaatgc ctacaaaggg ttttagtgcc ttagaagtcc 2520taagaagccc aatctgtatc aaagcagatc cattttgcaa ggatctttct tttagaactt 2580tctcagttct cttagtaaga actttagaag taatcttgat aataagcaca gacagcctaa 2640cagcagaggc aacttaaata actcctgagc agttggcact agaacagaat acttggaatg 2700acaccaaagt taaccaagtc cagcatatgt ccaaagagtt aagtgtttca tttactgtag 2760cattctgggt gagaaattgg ttgctgaaat cttaagacag tggtctcaac cttggctgca 2820cattggaatc acctgtaggg ttttaaagca tccaaatggt aattaacagg cagcaaaact 2880tcagaactag ttctgcatct actgtgcaaa gatcatgatt aactgtcaag acactggtag 2940aacagaacaa gcaaaagatt aagagttcaa aagtaaatgc aaccaattta acatgtagtg 3000ttattaaaaa attacaaagg cctagaccag cctgggcaac atggtgaaac cccatctcta 3060caaaaaattt ttaaaaagtt ggccaggcat ggtaatgcgc gcctgtggtc ccagctgctc 3120gggaggctga 
ggtgggagga tcacttgagc cttggaggtc aaggctgcag tgaatcatga 3180tcatgccact gtactccagc ctgggcaaca gagaccatct taaaaaaaaa aaatccttcc 3240attgaataga aacttcaaag tccttcatca gatcatggca ttagctactt tagcaaaata 3300gtgtaaactt tggttctgag taaaaatgac ttccctacca cagttgtgaa tgttaatatg 3360ctgataaaac tgccaagata tatttgatag gtaggggatt aaactcatgg tttgtcaaaa 3420gagtgttttt ttctagtttt attcttaaca gatatgttga ggtattcata tttgtttcct 3480tttgtggttt taatgaagac aatttgtaaa gtaatactgt tatgtatatg ctaaatgttg 3540gtaaatacta attaatttcc atcatttgta gacttgtttt gcaatgggat atatttttac 3600ttataatacc aatttaggct gggcgcagtg gctcacgcct gtaatcccag cactttggga 3660ggccaaggca aacggatcac gaggtcagga gatcaagacc atcctggcca acatggtgaa 3720accccatctc tactgaatac acaaaaattg gctggacatg gtggcgcatg tctgtaatcc 3780cagctactgt aatctcagct actcgggagg ctgaggcagg agaattgcct caaccgggag 3840gcggaggttg cagtgagcca agatcgcacc actgcactcc agcctggcaa cagagcgaga 3900ctctgtctca gaaaaaaaaa aacaaaaacc agtttaggct gggtatggtg gctcaccagc 3960actttgggag gctgaggcag gtggatctct tgaggtcggg agtttgagac cagcctggcc 4020aacatggtga aaccctgtct ttactaaaaa tacaaaaatt agccaggcat ggtggcgtgt 4080gccagtaatc ccagctactt gggaggctga ggcgcgagaa tcgcttgaac cctggagatg 4140gaggttgcag tgagcagaga tcgcaccact gcactccagc ctgggcaaga gtgagactgt 4200gtctcaaaag gaaaaaaaaa aaaaaa 4226 47 8467 DNA Homo sapiens 47 ttccccaaattgatggacat aaacccatat gcttatctca gcatgtgttt aaaaagcact 60 tgctgagattcagtgaccat ccaacattaa aaactgctga tagaaggaaa ctcacttagc 120 tgaattaaggacgtgttctt aaaatctacc gccaacgtaa tggggaggag ctcacggcgt 180 tcgttaatttattcattccg caaatatgtt ttgggcagtt acataccgaa tagtagatgg 240 aggtgtgcctgctgtcatgg agatagggtg atttcatcct gttgatcagg aaaactccta 300 ggtgcttgcaggtaaatgtg ccacaaagaa agtgaggacc aaaggttagt tgatgtaaaa 360 acaagtttgaaatgcatttt ggggtaattt atccggtcgc ttcgggcatt cctcgcggaa 420 ggcgtggtctggtgactcag aagccaacac actgcgggag tccagccgtc ggccccctgc 480 cgtgtggcgaggcccagtgt gtcccctttg taaggacagc acaagcagga gttaatggac 540 cggccatccatagcggtggt ggggcaggga gccagtttcc gaaagaaact cacgccgccg 600 
caggagggccctgtgggatg ctctgtgcag agctgttgtg cggaccggga gacgggaaag 660 cctggtggctgcaggagggc accgtgcaga agtatccagt aaaccaccca cagcacggca 720 gcagaaaaacgaggaaatta tatgtgtgta tgtttataag aactcagaag caatggtgag 780 caaaaagcaaaagcaagagg agaaagtcac agtgtgctgg catcaagttg cattgagagg 840 agcccgcggggtagtgcacc cgcatttcct cgttgcgttg agaggcgccc gcggggtagt 900 gcacccgcatttcctcgttt gagaggcgcc cgcggggtag tgcacccgca tttcctagtt 960 gccttgagaggtgccgcggg gtagtgcacc cgcatttcct agttgccttg agaggtgccc 1020 gcggggtagtgcacccgcat ttcctagttg ccttgagagg tgccgcgggg tagtgcaccc 1080 gcatttcctcgtttcattga gcggtgcccg cggggtagtg cactcgcatt tcctagttgc 1140 cttgagaggtgccgcggggt agtgcaccca catttcttca ctcgtttaga gttcggggct 1200 ctcagaacacagggagaata tgggagaatt ccttactaga tgttacagag gccacacagg 1260 gccacttttttctttttttt ttattgtgtc aggtatacgt aaatattcct ttcggtcagt 1320 tcagcacgtaggactgagtg gcattaggta cgctcactgt acagccaatg cctccatagg 1380 ccactttttaaatacgcggg gctaagggcc aacgacaaga ttgtcatcga ggacaataag 1440 tcgatggcgctgcctggtca ctggcttggt cagaaaacac atgccagggt gactggattt 1500 aacgttcagttttagaacca caaatctgcc agcccaggca ttcaagagga agtgagtaat 1560 tcactgaattgatggttagc aagacccttc aaagtccttg gaagttccgt gtttgctggg 1620 ggtcacaacagcaattgcgt ttctaaaaca ttgaaaacca cccgtttttc acacatctga 1680 atagcctgagttgtaacaga cctaagtaaa ggcgtccaaa cgtgcctgat cctgtggctg 1740 ggtcccaggagccttaacaa ggcattgaga gagctggatt gattgattag actcttgcca 1800 actgctgtgcggaatacaga aaatgcagct ccagctctca aagggctgaa aatctaattg 1860 aggtgaaaaaggtaacatgt gtgaaaacca tgtctgtgta gacttgtttt gtatggacca 1920 gaaaagttggaagtccagag atggatgcga ggaagtaggg gatggagttt ccttataacc 1980 tctggagggacagtccagat acataccggg gtgtgtagcg ggtggaggtt ctaagacagg 2040 ctggaggaacagtgcagaca cacaccaggg tgtgtagggg gtagaggttc taagacagtc 2100 tggagggagagtgcagacac acaccggggt gtgtaagcag tggaggttct aagacaggct 2160 ggaggaacagtggagacaca caccagggcg tgtagggggt agaggttcta agacaggctg 2220 gaggaacagtggagacacac accacggcgt gtagggggta gaggttctaa gacagtctgg 2280 tgggagagtgcagacacacc cggggccgtg tagggggtac 
aggttctaag acagtctggt 2340 gggagagtgcagacacacac ggggctgtgt cggggataga ggttctaaga cagtctggtg 2400 ggagagtgcagacacacacc agggcgtgta gggggtggag gttctaagac aggctggagg 2460 aacagtggagacacacactg gggagcatag gggtgctttt ctctgagtcc cctagtacat 2520 ggtagaggctgtagaccctc cgctcttggg cacgtgggta ggctctcagg atgactcttg 2580 gctcttgggcatgtgggtga gctctcagga tgatgcccag ccccaatttt caggcaattg 2640 tgcaaggacttgacccattc atccgctgag ctgagttgct gagactgctg ggtgcccggg 2700 tggtgatcattgtccgtggc atacagaaca cactgcagct tctgcaaagt gagctcattt 2760 cacgcattttatggctttgc caggctgctg ttgacctgcc agaactttta atcagacatt 2820 tggaggacctgttttgtagt cagtggagaa atattacaag gatagggtaa tttgaaatat 2880 ctaaggattgtaagtgacaa gttcatgtct aattttgcat ttccagtgaa agcaagtgtt 2940 ggctttgaatgttacttatg tgctgagatg tgtatattcc tcagtgctta attactaagg 3000 atttttagggccaagttttg ttacagtgaa tgattgtgga tgcataaaga ataaatttaa 3060 tatttttaaggcatggagat tatttgtatc taagaaacca ggtaaaataa agaaacattt 3120 atgcttgtgtgactgataaa agagttagag agacactcat attctgggag tttgaagaat 3180 gtcattttcattctctaaaa gtcttgttag tgtcacagca ttgaaaattt aaaaatccgt 3240 gtgtattttcttgctagtgc tggtacttga atatctgtat catccaccta tccatccacc 3300 tacccatatttctataatcc accgtccctc gacatgccta tcatctgtcc accatttctc 3360 tctgtctaattttcaaaaca tcctgtaagt ttatataaag gaagattttt cttcttgtga 3420 agttctctaaggctgacaag ttacctggca tgactgtggc ggatgcccat agccaggtgg 3480 tcctcggggtacagatgggg caggggcact tgtgagaaac acctgaagtg cttttcccca 3540 gcctccccggccctgccggg tggtggaggc gctgcacggt gccttccatg gagcaagccc 3600 ggggctccgcagggtcctca gcatgattca gatttccttc cacccccagc tctagatgat 3660 ttggtaaaaccacaaacagg cacaaaacag cccacatgga attctaaagt tttaatttca 3720 ttttggaatttatgcactca gatgaaatga tttatgatga tgttgagaat ggggatgaag 3780 gtggaaacagctccttggaa tacggatgga gttcgagtga atttgaaagt tacgaagagc 3840 agagtgactcggagtgcaag aatgggattc ccaggtcctt cctgcgcagc aaccacaaaa 3900 agcaactttctcatgaccta acccgtttaa aggagcacta tgagaaaaag atgagagatt 3960 tgatggcaagcacggtgggc gtggtggaga ttcagcagct caggcagaag catgaactga 4020 
agatgcagaagctcgtgaag gccgcgaagg acggcaccaa ggacgggctg gagaggacca 4080 gggcagccgtgaagaggggc cgctccttca tcaggaccaa gtctctcatc gcacaggatc 4140 acagatcttctcttgaggaa gaacagaatt tgttcattga tgttgactgc aagcacccgg 4200 aagccatcttgaccccgatg cccgagggtt tatctcagca gcaggttgta agaagatata 4260 tactgggttcagttgtcgac agtgaaaaga actacgtaga tgctcttaag aggattttgg 4320 agcaatatgagaagccgctg tctgagatgg agccaaaggt tctgagtgag aggaagctga 4380 agacggtgttctaccgagtc aaagagatcc tgcagtgcca ctcgctattt cagatcgcgc 4440 tggccagccgcgtttccgag tgggactccg tggaaatgat aggcgatgtc ttcgtggctt 4500 cgttttctaagtccatggtg ctggatgcat acagtgaata tgtgaacaat ttcagcacag 4560 ccgtggcagtcctcaagaaa acatgtgcca caaagcccgc ttttcttgaa tttttaaagc 4620 aggaacaggaggccagcccc gatcgaacca cgctctacag cctgatgatg aagcccatcc 4680 agaggttcccacagttcatc ctcctgctcc aggacatgct gaagaacacc tccaaaggcc 4740 accccgacaggctgcctctt cagatggccc tgacagagct cgaaacacta gcagagaagt 4800 taaatgaaagaaagagagat gctgatcaac gctgtgaagt gaagcaaata gccaaagcca 4860 taaacgaaagatacctgaac aagcttctca gcagtggaag ccgatacctc attcgatcag 4920 atgatatgatagaaacagtt tacaacgaca gaggagagat tgttaaaacc aaagaacgcc 4980 gagtcttcatgttaaatgat gtgttaatgt gtgccaccgt cagctcacgc ccctctcatg 5040 acagccgtgtgatgagcagc cagaggtact tgctgaagtg gagcgttcca ctgggacatg 5100 tggacgccatcgagtatggc agcagcgcag gcacgggcga gcacagcagg caccttgccg 5160 ttcacccgccggagagcctg gccgtggttg ctaacgcgaa accaaacaaa gtttacatgg 5220 ggccaggacaactgtatcaa gatttacaaa acttgttgca tgacttaaat gtaattggcc 5280 aaatcactcagctgatagga aaccttaaag gaaactatca gaacttaaac cagtcagtag 5340 cccatgactggacatcaggt ttacaaaggc ttattttgaa gaaagaagat gaaatcagag 5400 ctgcggactgctgcagaatt cagttacagc ttcccgggaa gcaggacaaa tctgggcgac 5460 cgacgttctttacagctgtg ttcaatacgt tcacccctgc catcaaggag tcctgggtca 5520 acagcttacagatggccaag ctcgccctag aagaggagaa ccacatgggc tggttctgtg 5580 tggaagacgatgggaatcac attaaaaagg agaagcatcc tctcctcgtc ggacacatgc 5640 ccgtgatggtggccaagcag caggagttca agattgaatg tgctgcttat aaccctgaac 5700 cttacctaaataatgaaagc cagccagatt 
cattttccac ggcacatggt ttcctgtgga 5760 tcggaagttgcacccatcaa atgggtcaga ttgccatcgt ctcgtttcaa aattccactc 5820 ccaaagtcattgagtgcttc aacgtggaat ctcgcatcct gtgcatgctg tacgttcccg 5880 tcgaggagaagcgcagagag cctggggcac ccccggaccc cgagaccccg gccgtgagag 5940 cttctgatgtccccacgatc tgtgtaggga cggaggaggg aagcatttcc atttataaaa 6000 gcagtcaaggctccaagaaa gtgagacttc agcacttttt cactcctgag aagtccacag 6060 tcatgagcctggcttgcacg tctcagagcc tgtacgctgg cctggtcaac ggggcagtcg 6120 ccagctacgccagagcccca gatggatcct gggattcaga acctcaaaaa gtgatcaagt 6180 taggcgtcctaccagttaga agtctactca tgatggaaga cacgttgtgg gcggcttccg 6240 gaggtcaagtcttcatcatc agtgtggaga ctcatgctgt agagggtcag ctggaggccc 6300 accaggaggaaggcatggtg atctcccaca tggccgtgtc cggcgtcggg atctggattg 6360 ccttcacctcagggtccacg ctccgccttt ttcacacgga aactctcaag cacctgcagg 6420 acatcaacatcgccacccct gttcacaaca tgctgccagg gcaccagcgg ctgtcggtga 6480 cgagcctgctcgtctgccac ggattgctga tggtcggcac cagcctggga gtcctcgtgg 6540 ccctgccggtcccacgtctg caagggattc ccaaagtgac cggaagaggc atggtctcct 6600 accatgcacacaacagtcct gtcaaattca tcgtcctggc cacggctctg cacgagaaag 6660 acaaggacaaatccagggac agcctggctc ctggccccga gcctcaggac gaagaccaga 6720 aggacgcacttccgagtgga ggagctggtt catctctgag ccagggtgac cctgacgcag 6780 ccatctggttgggagattcg ctgggatcga tgactcagaa aagcgacctg tcctcctcat 6840 ctgggtccctgagcttgtct cacggctcca gctctctaga gcacagatca gaggacagca 6900 ccatctatgatctcctgaag gatcctgtct cgctgagaag caaagcacgc cgggccaaga 6960 aagccaaggccagctcggcg ctggtggtct gtggagggca gggccaccgc cgggtgcaca 7020 ggaaggcccggcagccccac caggaagagc tggcgccgac cgtcatggtc tggcagatcc 7080 ctctgctgaatatataagca ggacggccgc cttctgctgt cagaatttgc aatcaagggt 7140 gacttctcagctaatcctac agcctgagtg gttaagctgt gtctacactg gttgggaata 7200 aattaaaaacagtatttggg ggagaaacgt gcaatagcgt aatggtggtg tccctgccaa 7260 ttccttccttctcttctgta cagcagaagt aattacaagc acttctcacg aaggcagaag 7320 actgatgcaattttcgagta attgagtgca gttctgggaa aataccacat tctttttgac 7380 tgctgtagtccatatatgaa tactaaatgt taaacttcat cagcgtcaga cctattgtat 7440 
catattagagaatttgcaga ctaagaattt atgagaaaat atatgtattc agtagtgcag 7500 gcatttattaacaattctta aaagttttac ctgattcaga ttcacgactt ttatttatat 7560 tctatatttttgaatttcag agtaaaattt gttaacaatt ttaaaagcca ggtaacacct 7620 accagtccagttagcatgat ttgctttcag aagtgagctg ggttttccaa agtggtataa 7680 tgtgtgtactgtatatttta acaaagtaat atttttgtat tgcatttttc tattaaaaaa 7740 ttaacagttaatgtttcagt caatgtatta tctgtagcat ttcacaaata atgtttgctt 7800 tgaaccaaaatgctcagtgc ctatcaacat ttggactcaa gcatcaacac caaattattc 7860 ctcccttctcgtataaatag agtgactatc cacaggagaa aagtgtgtgc tttagtatta 7920 gaggagataggcagagaagt cttgcttagt tccttcgtgc agcttcttgc ccctgttgac 7980 gtggaatgctgtgtctgctt tagcacgcac gctccgaatg actcctggtg ctaggccatg 8040 ctggctgctgtcactgagcg ggactcaggc caagaggcgt gacctcgggc cagcctgtct 8100 gttgtgcagacgcctcctct gcagaacgca tcagtttcta ttctgcagtt gcagagccag 8160 ccccgcgtgagaacgtgcat aatgagtgca caccatcatg tcaaggtgca tacttagtga 8220 gcgccatcctgctgaacgtg tatttcagtg tttcacttac tggacggata acaagaaaaa 8280 aatcctaacacaggcagtca ccagaaataa atgtctcagc actttacaga tgactaaaaa 8340 tgttaattttatgacttagc caaatatgtt ctaggttgca tatatccccc atgtgaaagt 8400 gatttcttcccaagcttctc aaactgttag ctgctgtctg acttcatcaa taaagtattt 8460 ttatttt 846748 4639 DNA Homo sapiens 48 gaaaagttgt cagtccttac tgttcaggac gttggtcaggtgatgcctgg agctaatgta 60 tgtgttgtga agttagaagg taccccttat ctttgtaaaactgatgaagt gggagaaata 120 tgcgtcagtt ccagtgcaac tggcacagcg tactatggattgcttggaat cacgaagaat 180 gtgtttgagg cagttccggt caccacagga ggagcacccatctttgacag gccattcacc 240 aggacaggcc tgctgggctt catcgggcct gacaacctggtcttcatcgt gggcaaactg 300 gacgggctga tggtcactgg agttcgcaga cacaatgcagatgacgttgt ggccaccgca 360 ctggccgtgg agcccatgaa gtttgtctac agaggcaggatcgctgtgtt ctctgtgacc 420 gtgctgcacg acgaccggat tgtcctggtg gctgagcagcggccggatgc ctcggaggag 480 gacagcttcc agtggatgag ccgtgtgctg caggccattgatagcatcca ccaggtgggc 540 gtgtactgtc tggccctggt tcctgccaac accttgcccaaggctcctct cggagggatt 600 cacatttctg aaaccaaaca gcgctttctg gaagggacgctgcacccgtg taatgtgctg 660 atgtgccctc 
acacctgtgt taccaacctc cccaaacctcgtcagaaaca accagaggtt 720 ggaccagcct caatgatcgt ggggaacctg gttgctgggaagagaatcgc tcaggcttcc 780 gggagagagc tcgcccacct ggaggacagc gaccaggcacggaagttcct gttcctggct 840 gacgtgctgc agtggcgtgc ccacaccact cctgaccacccgctgttctt gctgctgaac 900 gccaagggca ccgtcacaag cactgcaacc tgtgtccagctgcacaaaag ggctgagaga 960 gtggccgcgg ctctgatgga gaagggaaga ctgagtgttggggaccatgt ggctctggtc 1020 tacccaccag gggtggacct cattgccgcg ttctatggctgcttgtactg tggctgcgtg 1080 cctgtcaccg tgcggccccc gcaccctcag aacctcggcaccacactgcc caccgtcaag 1140 atgatcgtgg aggtcagcaa gtctgcatgc gtcctcaccacgcaggctgt cacacggctg 1200 ctcaggtcca aggaggctgc tgctgccgtg gacatcaggacctggcccac catcctagac 1260 acagatgaca tcccaaaaaa gaagatagca agcgttttcaggcccccctc ccccgatgtc 1320 ctcgcatact tggacttcag cgtgtcaacc actgggatattagcgggagt gaagatgtcg 1380 cacgcggcca caagcgcctt atgccgctcc ataaagctgcagtgtgagct gtacccctcg 1440 cggcagatcg ccatctgcct cgacccctac tgtggccttggttttgccct gtggtgtctg 1500 tgcagtgtct actcgggaca ccaatcagtg ctggtgcccccgctggagct ggagagcaac 1560 gtgtccctgt ggctgtcggc cgtcagccag tacaaggcccgcgtcacctt ctgctcctac 1620 tctgtgatgg agatgtgcac caagggccta ggcgcacagacgggtgtcct caggatgaag 1680 ggggtgaacc tgtcatgtgt gcgcacgtgc atggtggtcgccgaggagcg gcccaggatt 1740 gcgctgaccc agtccttctc caagctcttc aaggacctgggcctgccggc ccgcgccgta 1800 agcaccacgt tcgggtgcag ggtcaacgtg gccatctgcctccagggcac agctggcccg 1860 gaccccacaa ccgtctacgt ggacatgcgg gcactgcgccatgacagggt tcgtttggta 1920 gaacggggtt ctccgcacag cctgccattg atggagtctggaaagatcct ccccggcgtg 1980 aaggtcatca tcgcacacac cgagaccaaa ggacccttgggagactcaca cctgggagag 2040 atctgggtaa gcagccccca caatgccacc gggtactacaccgtttacgg ggaggaggcg 2100 cttcatgctg accacttcag tgcccggctg agttttggagacacacagac catctgggca 2160 aggaccggct accttggctt ccttcggcga acagagctcactgatgccag tggagggcgg 2220 cacgatgcac tgtatgtggt tgggtctctg gatgaaactctggagctcag aggcatgcgg 2280 taccacccca tcgacattga gacctctgtc atccgagcacacaggagcat cgctgagtgt 2340 gccgtattca cctggaccaa cctgctggtg gtggtggtggagctggatgg 
gctagagcag 2400 gatgccctgg acctggtggc cctggtgacc aacgtggtgctggaggagca ctacctggtc 2460 gtgggagtgg tggtcatcgt ggacccaggg gtgatccctatcaactctcg gggtgagaag 2520 cagcgcatgc acctgcggga cggcttcctg gctgaccagctggaccccat ctatgtcgcc 2580 tacaacatgt gagcgcagca caccggccca ggtgccggagatgaatgagc cccagcagtc 2640 caaggtgtga tgtgggaaga caccgcagag ctcactcaccgggactcgcc cttcctgtgc 2700 tcttacagat ccctctcaac aatccccgca tctccttttagaaagcactt cctgaattat 2760 ttaaagaaat attttgaatc tgccaagtac atttacaaaaacacggatgc tggtatttta 2820 acagatggag agacaaggaa aggaaaggaa aggcctggcatgggcattgt gaggaatcac 2880 aggcaccgag gttgttctct gctgtactgc aagtttgcactttctttagg ctaaaaatat 2940 agttcctgat ttttaaaatt cagttattta ttcccacttcaaatgacaag ttcatatata 3000 gaatttacgg gagaaacttg agaccattta cggggagacatttgattctg ggaagatagc 3060 agagtacaaa ccagctgcct cacttctgtt tcacagggaggctgatggaa aaaggaagca 3120 agctggaccc atcctccctg ctcacagagg gcactgtggtcacacacagt gcctcctctg 3180 ccagttcctt tattgaaaga ggtgtgctgg ctggccacggtggcttacac ctgtaatccc 3240 agcactttgg gaggccgagg cgagcagatc acgatggtctccatctcctt acctcgcaat 3300 ccgcccgcct cagcctccca aagtgttggg attacaggcatgagccaccg tgcccggcct 3360 atagaaatca atctttttga ctcttctcac ttttatctccccatgcccaa ggtttgcctg 3420 ttccataaca ctcactccct tcccccttgc taatcagaagccatctcctc tcagtgtctg 3480 atctctgctc ttcatacatg attacagtca tggggtagagagtgcttgct aaattatgca 3540 gttaatccta tggtgcttta attttcaggc cttcaaaaaacacttgtaca gtgatgtgca 3600 gatttttaaa cagttgaact tccttgtact acagtttttgtattgacagc caaatttgtc 3660 tttcattctt cagattgtga ataaagtgat ttttacagggcttccagcaa agtttttcct 3720 ttcatctaag gcttgtagaa ccctagctta tatagctgcttacatgagaa atgcaaaatc 3780 tgtattcacc atgactttag taacaaaggt aaagttttttagtagtgcca aggcaagagg 3840 aacaatcttg gtggtagtac taagttttgt caatattgtggtttcctgat tgtattgttg 3900 gctttctctc tgagcattga ggtatactag aagtagagcttctcaaacat aatatcatta 3960 cctcataagt attaacaaat caggcccaaa gagcgtaagtcctagaaatt tgttttaaag 4020 cagccctagt catggtgctg gtgctaccgc cttgttttaggagcctgcct cctgtcagta 4080 tgaaaccctc acctgaaaaa 
tgccagcctg gacaccaaacactgagcccc ttcaacaggc 4140 acattatttc cccctgagat ccataaggga atttagtttctactattgta gagttctgaa 4200 aagaggtaaa atagtagtcc tttggtcatc ctatttttgctttcaatttt gatatttcag 4260 actgtaaaag gccttggggg atgatagtac atgtggtagcagtaattttt ttgaagcaac 4320 tgcactgaca ttcatttgag ttttctctca ttatcagattctgttccaaa caagtattct 4380 gtagatccaa atggattacc agtgtgctac agacttcttattatagaaca gcattctatt 4440 ctacatcaaa aatagtttgt gtaagttagt tttggttaccatctaaaata tttttaaatg 4500 ttctttacat aaaaatttat gttgtgtttt aaaatccttaggggctttat ctatttttct 4560 aagtcagtta actgtacttc taaaaaaagt attttgtatctacttttgta acttcgtcag 4620 aataaaatat attgaaagc 4639 49 1769 DNA Homo sapiens 49 gctttcaccc attagcatta cttacgtaga taattcttta tgcctagttattatacatat 60 taatttttaa ggtatacatt taaattacac aattgttcat tgtggtttgtatcccagaat 120 gtgttgtgtt ttttaaaaga tgcataatag ctgaatgtat gcatgactttgaaagaagtt 180 aaaatggtga ttttttttca cctcttgtac attttaaaac caggccaaatctatttgcca 240 agcagtgtat cactaataag aaaagcagtt tttcctttta ttgcagtttttgtttatctg 300 ccatagaatt tccttatact gtggcttggt attattcaag attagctatttcgctggtat 360 tacatctttt taaaagccta ttataacatg gttagcctat aaggcagtgttggtcccctt 420 ctaatattgg cctcataaag gggttccact gtactttccg catattactgtgttgttgtt 480 ttcctttgtg gatatataag caaattgagc ttgggtgatt tttatggagacaataattag 540 acaatactgt ataattagtt ttacttaata gattatcatc ttgtgagaagagatgtttaa 600 acgtggtaaa tcacttcata ttacaaaaca gttttacact taatatgttaacattgggtg 660 caataattta gtagcattag ctttagttac aaatataact ggatctttctgctgacaact 720 taggttgtat gagttatgct taaaagcttt aaatctgatg tttcctgtacctgccacact 780 atgttagaat gtgtccttca aacatatcct cctgcaactt ctcaaactgtactaaattga 840 tatttcttga agtctaactc tgtgctaaca gatctccatt ttaaatagaatacggtttta 900 atttttgata agctgctgaa ttttaaagag agttttttgg ggccaccaaatattttggat 960 catgcagaga atatatattg tactgtagta attttgtatt tacatttgtatgatgtgaca 1020 taatagatgt gaatgttaat cactgcttga ctatgttaat aaagttgtttaactataaaa 1080 aaaaaaaaaa acccacgcgt ccttcagatc aatccatcta tgcaaatttatggggaaaaa 1140 ttgtttttta aattaaattt 
ccaataccca agccctaaaa ttgatggatgtgaccccagg 1200 tgttcccctt acctcttggc cccccaaaac agggacagac atagatggtgggctggaaca 1260 cccctcacct cctgtattcc cagaaagcct cgcgttgagg tgtgttggccagctccctag 1320 tttgtgctta ctatacctgg ccacgcctcc ctacctaagg ccgctggcttaaccctaggg 1380 gcaggcagtg ttagatcaga cccagacctt ctcatcccac cctcatcacatcggggagag 1440 gggactccag gggcgggaag gcaggcgtcc ctccatttgg ccagggtgggcggcgaggag 1500 ggggtcactc tgcaggaaca ctgagctctg aacacctctc gcctgctgcctgcctcacac 1560 cctctgcatt cgctgtttcc tctgttgggg gagggggttt gtgaggggaatattagatta 1620 caccttgtca tttggaaagc cccgtgtctc cggcggccac agcgaggttgggggggtggt 1680 gagggaagtc catggattgg ccagaactgg gggaaaaaca aaaagaaatgagagaaagag 1740 agagcgggta ccaaaaaaaa aaaaaaaaa 1769 50 3062 DNA Homo sapiens 50 agaagcgggc ggcgcggggg agatgcataa gcttaaatcg tctcagaaggacaaggtccg 60 ccagtttatg gcgtgcactc aggctggcga gagaactgct atctactgcttaacgcagaa 120 tgagtggaga ctagacgagg ccacggacag cttcttccaa aacccagactcgctccacag 180 ggagtccatg cggaacgctg tggacaagaa gaagctggag cggctgtacggcaggtacaa 240 agatccacaa gatgaaaaca aaattggagt cgatgggatt caacagttttgtgatgatct 300 gagcctggat cctgccagta tcagtgtatt ggtcatagcg tggaagttcagggcagcaac 360 tcagtgtgaa tttagcagaa aggaatttct agatggcatg acagaacttgggtgtgacag 420 catggagaag ctaaaggctc ttctgccaag actggagcag gagctgaaggacacagccaa 480 gtttaaagat ttttatcagt ttaccttcac cttcgctaag aacccagggcagaaaggttt 540 aggttcacct ccatttctca atgtgaaagc tttacatcat taagatgagttgaatataga 600 tttcaattaa tgttcttcct aagtgataag gatgtagact tataagcaggacaagactaa 660 tcatcttctt agcattttac tgcgggtccc atcgacttag aaatggctgttgcgtattgg 720 aaattagtgt tatctggaag gtttaaattt ttagatctct ggaacacattcttaatggaa 780 catcacaaaa gatcaattcc aagggacacc tggaacctcc tgctggactttggaaacatg 840 attgcggatg atatgtctaa ctacgatgaa gaaggagctt ggcccgttcttatagatgat 900 tttgtagaat atgcacggcc agtagtcaca ggtggaaaac gcagccttttctaggcagca 960 agttaagcag gagtaagatt atgaaatgat ttgtatcctg caaggagattgcagtcagtt 1020 cctgggtgca ttgtcgctga ttccagaagt cattcttgac cagccatgaaaccagaggcg 1080 ccatcccatt 
ctgccggagg acagccagcg gctgctttgt ggacaccgcaggaagttcct 1140 cgggacacgg ctgctttggg atgtttggag atttgtcatc atagcttttgcgttaggaaa 1200 tttctgcatg attttttaat atttacaaaa tactaaggta gagccatagcgccgcctgtg 1260 ggaccgcaga gcatgctgcg tagctcgcgc gtcaggcgaa ccagcgtccggcagcgtccc 1320 gccgaatgac gttgcggtgg cactggcaac acggcatgtg tcctctgcagggcgctgcgt 1380 ttttatacac gtcaaagctg ttaagaatgt gccctaaggg agaggatcttgtcgtagagt 1440 ctaatgtttt ttaaaattgg tgccagcaat tcacgattta tattttttgaattaccaaat 1500 atctagattt accagctcta tttttgtttt cattttctct agacattcatctgaaaatca 1560 ttttatggtt ctcaatcccc atgtagcttt gcatagcaac ggcacacgtggcacgattcc 1620 agcagagttt atctcacacc gtttatatat cactgggcct ctcttacttaaatattattt 1680 gacctgcctg agaagcttca taaagtatgt ttttttaaaa tatattttaattacatttaa 1740 aaagacattt ttccatgaaa aacatttatt ttatgagtga tgaattatagattttaaaat 1800 caaggccggg cgtggtggct cacacctgta atcccagcat tttgggaggccgaggcaggt 1860 ggatcacctg aggtcaggag tttgagacca gcctggccaa catggtgaaaccctgtctct 1920 actaaaaata caaaaattag ccgggcatgg tggcacatgc ctgtaatcccagctactcag 1980 gaggctgagg caggagattc gcttgaactc aggagacgga ggttacagtgagtcgagatc 2040 gcgccactgc actccagcct ggacgacaga gtgagattct gtctcaaaataaatacatac 2100 atacatacaa taaaaccgaa tgagctggtt ctttccatcc tcttgtggtacccgtagtgc 2160 ccagcatcca gacctttgtg cccctgtgca catttggaag ctaaaatgtacatcgttgtc 2220 tgaaaaaacc caaccccaaa accttcatct gattggtgag ctgaagtctgtccttgcacc 2280 atgttatcat ctgtttctcg tgtccgcctg gttgaggagg acccacgagtgctgccgagg 2340 tgtggagggc tggtattgag ttgtggacat cactgttgac cctacctcacgtgccgagac 2400 tctcatgtca caggcgtgcc ttgctgcccc cctgcagcac tgtgcaggacgtggaccagc 2460 tggagctgct gcccagcaca gaggagagtc gccgcagatg acctagctgcggtgtgagag 2520 agcatggccc agacaagcag ctgggttggc ttctgagaac aggacttaccctgggcttca 2580 ggaacatctg atggctgagg ttagtgtgct tggaggctgc aggacgaactgtcgatgttt 2640 cttagcagag atggtcacag agggcagcag ggacaggact ggaagggacctgcagcctgc 2700 agaccccgcc tggcccccgc tggcttctgg ctggtccagt gatgggcaagtgacagacct 2760 tccccaggct ctgcttccag aactctaatg ggaaactggg 
cctgtctacctttagaagtc 2820 ttcgattctc agagagcatt tgtctaatac aataaaaact ggcattaatacaaacctcaa 2880 aaacgtgagc gtatcttcca ggcttcatgg attcttgaca tgtaattgttttgttcagaa 2940 aagtttatag aattcacata attctgtata aactatggag atccacagtacttttttgtt 3000 tttgagattt aaagttctaa gggattgtca atagatatca aaatattaatcattggacaa 3060 ag 3062 51 1578 DNA Homo sapiens 51 ggtagtgaccctcgggcctc gccatgaaga gccgctttag caccattgac ctccgcgccg 60 tactcgcggagctgaatgct agcttgctag gaatgagagt aaacaatgtt tatgatgtgg 120 ataataagacataccttatt cgtcttcaaa aaccggactt taaagctaca cttttacttg 180 aatctggcatacgaattcat acaacagaat ttgagtggcc taagaatatg atgccgtcta 240 gttttgccatgaagtgccga aaacatttga agagtcggag attagtcagt gcaaaacagc 300 ttggtgtggatagaattgta gattttcaat ttggaagtga tgaagctgct taccatttaa 360 tcattgagctctatgatagg gggaacattg ttcttacaga ttatgagtac gtaattttaa 420 atattctaaggtttcgaact gatgaggcag atgatgttaa atttgctgtt cgtgaacgct 480 atccacttgatcatgctaga gctgctgaac ctttgcttac tttggaaagg ttgactgaaa 540 tagtagccagcgcacctaag ggtgaactac tgaagagggt gcttaaccca ttacttccct 600 atggaccagctctcattgaa cactgtcttt tagaaaatgg attctcgggt aatgtcaaag 660 tggatgaaaaacttgaaact aaagatattg aaaaagtact tgtttctctg cagaaagcag 720 aagactatatgaaaacaaca tccaacttca gtgggaaggg atatatcatt cagaaaagag 780 aaataaaaccatgcttggaa gcagataaac cagttgaaga catactgacg tatgaggaat 840 ttcatcctttcttgttttct caacattcac aatgtccata tatagaattt gaatcatttg 900 acaaggcggtggatgaattt tattccaaga tagaaggcca gaaaattgac ttaaaagctt 960 tacaacaggaaaagcaagca ttgaagaaat tagataatgt tcgaaaggat cacgaaaaca 1020 gattggaagctcttcagcag gctcaggaaa tagacaaact gaaaggagag ctcatagaaa 1080 tgaacctacaaatagttgac agagccattc aggtagttcg aagtgcttta gctaaccaga 1140 tagattggacagaaattggg ttaattgtga aagaagccca ggctcaagga gaccctgttg 1200 caagtgcaatcaaagaatta aaactacaaa caaaccatgt tacaatgctg ctaagaaatc 1260 catacttgttatcagaggag gaagatgatg atgttgatgg tgacgtcaat gttgagaaaa 1320 atgaaactgaaccaccaaaa ggaaaaaaga aaaaacaaaa gaataaacag ctgcagaagc 1380 ctcagaaaaataagccctta cttgtagatg ttgatctcag cttgtcagca tatgccaatg 
1440 ccaaaaagtattatgatcac aagagatatg ctgctaagaa aacacaaaag actgttgaag 1500 ctgctgagaaggcattcaag tcagcagaaa agaaaacaaa gcaaacatta aaagaagttc 1560 agactgttacctctattc 1578 52 2956 DNA Homo sapiens 52 ttgcagagat aaatggttcagccctatgta gctacaacct aaagccttct gaatacacta 60 catctccaaa atcttctgttctctgcccca aactaccagt cccagcgagt gcacctattc 120 cattcttcca tcgctgtgctcctgtgaaca tttcctgcta tgccaagttt gcagaggccc 180 tgatcacctt tgtcagtgacaatagtgtct tacacaggct gattagtgga gtaatgacca 240 gcaaagaaat tatattgggactttgcttgt tatcactagt tctatccatg attttgatgg 300 tgataatcag gtatatatcaagagtacttg tgtggatctt aacgattctg gtcatactcg 360 gttcacttgg aggcacaggtgtactatggt ggctgtatgc aaagcaaaga aggtctccca 420 aagaaactgt tactcctgagcagcttcaga tagctgaaga caatcttcgg gccctcctca 480 tttatgccat ttcagctacagtgttcacag tgatcttatt cctgataatg ttggttatgc 540 gcaaacgtgt tgctcttaccatcgccttgt tccacgtagc tggcaaggtc ttcattcact 600 tgccactgct agtcttccaacccttctgga ctttctttgc tcttgtcttg ttttgggtgt 660 actggatcat gacacttctttttcttggca ctaccggcag tcctgttcag aatgagcaag 720 gctttgtgga gttcaaaatttctgggcctc tgcagtacat gtggtggtac catgtggtgg 780 gcctgatttg gatcagtgaatttattctag catgtcagca gatgacagtg gcaggagctg 840 tggtaacata ctattttactagggataaaa ggaatttgcc atttacacct attttggcat 900 cagtaaatcg ccttattcgttaccacctag gtacggtggc aaaaggatct ttcattatca 960 cattagtcaa aattccgcgaatgatcctta tgtatattca cagtcagctc aaaggaaagg 1020 aaaatgcttg tgcacgatgtgtgctgaaat cttgcatttg ttgcctttgg tgtcttgaaa 1080 agtgcctaaa ttatttaaatcagaatgcat acacagccac agctatcaac agcaccaact 1140 tctgcacctc agcaaaggatgcctttgtca ttctggtgga gaatgctttg cgagtggcta 1200 ccatcaacac agtaggagattttatgttat tccttggcaa ggtgctgata gtctgcagca 1260 caggtttagc tgggattatgctgctcaact accagcagga ctacacagta tgggtgctgc 1320 ctctgatcat cgtctgcctctttgctttcc tagtcgctca ttgcttcctg tctatttatg 1380 aaatggtagt ggatgtattattcttgtgtt ttgccattga tacaaaatac aatgatggga 1440 gccctggcag agaattctatatggataaag tgctgatgga gtttgtggaa aacagtagga 1500 aagcaatgaa agaagctggtaagggaggcg tcgctgattc cagagagcta aagccgatgg 1560 
cttcgggagc aagttctgcttgaacctagc cgacggttat ggaaacccat tgacattcca 1620 aaacaatata tacacataactatgtatttg tgtgtgtggg tgtgtgtata tatgtatatg 1680 tatgtgtgta tatatgtatatgtatataca cacacacaca taaatcagcc aaaatcagag 1740 aaaaggaaca gggatttaatacctttttta tgcttatttt tgtcaaacat gtactccttt 1800 catacgggtg gcttttacaaggcaacttcc gtcatttaat gttttcaact gtaattgtct 1860 taatggaaat gttaaaattcatatctgatt aacattttta ataacttaga ggagatttta 1920 actttattta aaaataggtaaaattattgt acctaattat gtctaaagtt tattcagggg 1980 taatttccct gatgtctgtataaaatcaag atcttatttt actgatgcat aagtcctagt 2040 gggtcaagac taggcatatgctttcagata aataaggaat tactccaatc agttttcccc 2100 aatcaaagaa gccatgtcattttactttta gaaacataca attgggccca atatgggaat 2160 tttcataata gttcatacatttgtcagcca acattaaaag gtaaccaact cctcaggtat 2220 ttgtagttta ccctaacgcttctttaaaag aaagtaggta aaaaaagaaa agggtagata 2280 atctttcgta tgcaaacttttcccttatat tttgtctttc tttccttttt gactttagta 2340 gcatcctcca cacatttgtgtgcctgattt gaaaggaagc tggggcaccc agcgagttta 2400 gcctttaagt ttctgtgtattgatttgcag attaagtaat gctgagagga ataaagaagg 2460 gacagaaaca tggaacataaagcattgaaa attccggtgc ttgggcttcg gcttcagagt 2520 aacgtcagtg gcttagggttaaacggccat tttattcaaa tgcttgctat acaatctgaa 2580 aacacactgg caggtgctcctctccttggc aattcattga gtatccagag ttctacgatg 2640 tttaactgaa gaattggctaatgttttgat cctccagtgt gactgttgtt tttgtttggg 2700 ggtgggtttg gggtttttgcttttttattc ctgaagctta ccagatatga atggctaata 2760 ctccattgtt ctgcttgttgtaatggtgaa tgctttaaga aaaaaaagtg taatttgcta 2820 agaataattc atgatctgtttatgcgataa ctcctttttg ttacaatttt tttaaaaaaa 2880 gctatttttg ttaatgtaaagtaaatattt cagagcaaat tttttaaact tattgcacta 2940 aatacaggct ctgtac 295653 1861 DNA Homo sapiens 53 aagatgatga gcaaacaggc agctcttttg ggcaatgaagatacagctgt tgaggaacct 60 gtccctgaag ttgtaccagt acaagtagaa actgccaagaaatccaaaaa gccgagtaga 120 gaagttatca gctgcatgtt tgagcctgaa gggaatgcctgcagcttgac ggacagtacc 180 gcagaggagc acgtgctggc gctggtggag cacgcagctgacgaagctcg ggacaggatc 240 aaccggttcc tcccaggcgg caagatgggc tatctgaagaggaacgggga cgggagcctg 
300 ctctacagcg tggtcaacac ggccgagccg gacgctgatgaggaggagac ccacccggtg 360 gacttgagct cgctctccag taagctactc ccaggcttcaccacgctggg ctttaaagac 420 gagagaagaa acaaagtcac ctttctctcc agtgccactactgcgctttc gatgcagaat 480 aattcagtat ttggcgactt gaagtcggac gagatggagctgctctactc agcctacgga 540 gatgagacag gcgtgcagtg tgcgctgagc ctgcaggagtttgtgaagga tgctgggagc 600 tacagcaaga aagtggtgga cgacctcctg gaccagatcacaggcggaga ccactctagg 660 acgctcttcc agctgaagca gagaagaaat gttcccatgaagcctccaga tgaagccaag 720 gttggggaca ccctaggaga cagcagcagc tctgttctggagttcatgtc gatgaagtcc 780 tatcccgacg tttctgtgga tatctccatg ctcagctctctggggaaggt gaagaaggag 840 ctggaccctg acgacagcca tttgaacttg gatgagacgacgaagctcct gcaggacctg 900 cacgaagcac aggcggagcg cggcggctct cggccgtcgtccaacctcag ctccctgtcc 960 aacgcctccg agagggacca gcaccacctg ggaagcccttctcgcctgag tgtcggggag 1020 cagccagacg tcacccacga cccctatgag tttcttcagtctccagagcc tgcggcctct 1080 gccaagacct aactctagac caccttcagc tcttttattttattttttta gttttatttt 1140 gcacgtgtag agtttttgtc atcagacaag gactttgatcctgtcccctt tggcatgcgg 1200 gaagcagccg cggggaggta atgaattgtc tgtggtatcatgtcagcaga gtctccaagc 1260 cccacgaacc cgtgaggagt ggagtcatac gcgaaggccatatggccatc gtgtcagcag 1320 agagagtctc tgtacacagc cccgtgaacc ctgaggagtggagtcataca cgaagggcgt 1380 gtggccatcg tgtcagcaga gagagtctct gtacacagccccgtgaaccc tgaggagtgg 1440 agtcatacgc gaagggtgtg tggccaggct gcagagctgcgtgccgtttg tgtccgagca 1500 tcacgtgtgg ctccagccct tgtttctgcc agtgtagacacctctgtctg ccccactgtc 1560 ctggggtcgc tcttgggagg cacaggcatg ggtgtgtctggcctcattct gtatcagtcc 1620 agtgtgttcc tgtcatagtt tgtgtctccc aggcaggccatggtaggggc ctcgcagggg 1680 ccattgggga gcacagggcc aggctggggt gaggagagctcccctgtttt ctgtttaatt 1740 gatgagcctg ggaaaggagt gtgttctgcc tgcccgttacagtggagcgt tccgtgtcca 1800 taaaacgttt tctaactggg aaaaaaaaaa aaaaaaaaaaaaaaaaaaaa aaaaaaaaaa 1860 a 1861 54 1758 DNA Homo sapiens 54 atcttttgtacaagaataca gaatgggaag aatgtacaaa atgaaaagac aggcaaacaa 60 atgtactttccttgcactat ttctataaca ccatatagga tgtggcatta tccagtacat 120 
tttatcagtgtagtctattc attctggtct aaaacggtaa ttttggctga ataaccccca 180 aatctaactttgctttaggc tttataatta attgatgctt gattcttctt tctacatctt 240 ttaaaaatgaaattggacac tagtgttttt ctttaaaata actctggtac tagtaagtta 300 gcaaagtctatttttaaagg acactaatac ataaagttta ttttttcttt caatatccta 360 aataaagggacaatgggagt catatcacct ttgggcctac tctgaaatta ccaatagtgc 420 taaaagctaagatattgtga tctagaactt aaactttgga aaacaacccc atctttctat 480 ggcacattgaggaactgaag tactttacct catttcctac caatcatttt aagagaattt 540 ggttgtatttcaaagaacaa aacaacacaa tttctgtcct gctgtttatt ttgcagacca 600 cacacaaagttaaatatgaa cagattaaat tatgacatca tacaaatata ggcacaaatt 660 aagtggacgccacaaaaggt gtgtcagaaa acactgaatt gactgaatct gagtcagatt 720 tccccctcgtggactttagg atgccctcag ctatcactgc ctcgctggga aagagcctga 780 ctttcccgttctgccctgct gcagtatcct tcaggggttc caaaaacacc actcttttac 840 ctgcacctgccttccgttca tcagcggagg catcactagc ggggccagga ctgagaatcg 900 atgaatgggcattgctttgg tgtagcatat ttttctgtct cttggtttta cacttgcagg 960 ggcatggagtcagatagagg tacaaaagta ccaaaacgat actggccacg caagcagcaa 1020 gagtggtaaaagctgtgtta aatgcctcat gagcatggga tctgcttaca gtgaaattgc 1080 tcacatttattgtgacgtcc acagtttcat ttaacaggcg ttgcttattc attgcgatac 1140 aagaatacactccagcatcc tcaaaacgag ggctttctat aaccagactt ccattgtgaa 1200 acacgtaaaagttttccatc tctttatccg gctctagcag tctgttatct ggacccaccc 1260 agatgaaatccgtatttgca ttacctgtct tgctgtcaca gtggaccatc agtctttccc 1320 cgacctgagcctcatgaata aagccaagcg cacgaaagga accattgatg atgctgtcag 1380 agcaattcataaagctatcc tggagcagaa gtacctgacg cgagtgcctg gagtcagacc 1440 acaggcgacaggtgtaatcg ttcttaaaat ccatcactga gctaaagtgc ctacgatacc 1500 aaaagaccagcaaggagtac agggaacagt cacagacaaa tgggtttcca tgaaggtaga 1560 tgcctctcagctgttttcct ggcactaaat ttatgtggtg cattggcatg gaaggaattc 1620 ggttataagaaacatctaaa aacatcagtt ctgccagctt gaaccttcca acatacaaat 1680 ccatcggaaactgtgtgaga aaatttccac ttaagtagag tttctgcaac tgggagagcc 1740 ctccaaacgctgaaggat 1758 55 742 DNA Homo sapiens 55 cagatctata gtcatttaaatatttccaat cagcatttac gtaaacaagt atactaaaaa 60 
tagtagtttt ggttgagatttttgcacatt aacttccttt agtttgtaag aatcatgcat 120 tttggatgtg ttctcttatctgacaacaca gttttaaatg ctgtctttat ctgttttgtt 180 tgttttacta ataacacgtagttttgaacc agtaattact taacagctag gcatggggga 240 gacagaattg gactgagcaccaatcattgt tctaccttca gtgagctgtg cagccatagg 300 caactcccag cttctgctgtgcacctgtct tgagctggaa acaggtggcg ggaagcattc 360 tcaaaggccc cttctagcactaactttgta tgttcagtaa atatcagttc cctggaccag 420 ctccttttat tctggtacagaattattctt agcctatggg gtgggggtgg ggggacagta 480 gtgtctatta tttgtgaattttggaaccag tgtcattact ttacagtcaa ggaggcatgt 540 tagagtgttt ggatttttatcccttgataa gggcccacat ccccacatac agagatgctc 600 aattccaatt gaaaaacctcagaatttcta agtgtcaaag caatagtcta gatttttttc 660 tgaaaattca agaaatgtgttttctcaaat ctcttttttt atcctttctc ataatcctgc 720 cacttaatgc ccagtttgga tc742 56 259 DNA Homo sapiens 56 ttattacctt ctaacatact atataatttacttgtttgcc tcccccctgc taaactgtaa 60 accccaagag gacaaggacc tttgtgtattttgtttatta gtgtgtccca agcatacaaa 120 acagtgatta tccaaatatg tattaaatggcaacctaatt ctgaaactgg atttttgttt 180 tggagtgtat gctgaatcat ggtagcattgaccttcagtt ttgatatctg tgtgggttcc 240 tagggcctag tcaagatgg 259 57 655 DNA Homo sapiens 57 gatcatcgct tgaggccagg agttgaagac cagcctgcgc agcatagcaagaccccatct 60 ctacaaaaat aaattttaaa aacaaagcaa aaatgttaat tggttcctcataaatttgaa 120 atataaatgc acattttggg agtgtgttag aaagataacg aaagtctgattttagccttg 180 gttctgccac taaaggacac tttgggtact tctgaacttc ttgaacctaaatttccttat 240 cttccttagt agcttcatag tttttggtga gaatcaagag agatgcattttgtaaattgc 300 aagctgtcat cttatttaat cctcatgaaa agtgaatgtt agatgatattagcatcctag 360 ggttatatta gacgttaaac tgtcagggtc agagagctgg tgtgaagttgagccaggact 420 taattcatag ctatttcacc actgagttta tggatgcttt tccattatacaccttaacac 480 ttcctgtgat gactaattaa gttgtatatt gtagggaggg ttattggacccgtttcgtaa 540 ggctattctc tataaataag aattttcagt gttcagtatg ccctactcgttactcacttt 600 ttaatccctt gtaatccagt ttaagccatt actgccctaa aaattacttagatct 655 58 573 DNA Homo sapiens 58 atgaggatct ttgttaatat ggagtaacgacatcataaac aatcttttct actgtccttt 60 tttatttacg 
ttcaattttt tgaacaggcaatacagtcat acaatccaaa agatatgaaa 120 agacataata gtaatgtctc attctcatccgtttccctta ggtacccatt tcccttcccc 180 acagacacat actttgttaa tagtttttcttccagagatg atatatgcat atgtacaagc 240 aaatgcaaat gtttttattt ctttttaacaccagtagtaa tatattacat atatacatac 300 tgcactgtgt atataatgta cactattgtttcctgctttt tacatttata cttaatatgt 360 atttttcaga tactgtcatc ttactatatataaaaagcta actcatttgt ttttcagctc 420 tctcatattt gatggtatga ataaaacaaatttagccagt tttgtattaa gaggtttctt 480 tgattgtagt tttttgctgt tactaacaatgctgcagtgg ttgggcgcgt ggctcacacc 540 tgtaatctca gcactttgga aggctgaggcggg 573 59 594 DNA Homo sapiens 59 gatcattctc acaacataac tatgcatgtagaggacaaga tttattttct ttcctccctt 60 tgcccagtag ccacatctgg tttactcaggcagcatctac taagaaattc agcacctgca 120 tatctctgtg acatggtcac ttagagcttatcttccctat gaatctccag atctgtgagt 180 cgagcagatt tcatgttgca gattcacctttaatgcaaag actgtattat cctcacatga 240 ctttttttct tgtcttactg taccttaaaaggtgatagag taattctgta ttttctaacg 300 ggaagattca aaggagctga atgtgttatgcttccaaaca actgaatgta aaacactcct 360 agccagttgt tgcattccct atatttatttacttccaata ttttactgta aaagtaggga 420 gaaatattat gttgatagtt gtttcatattctctcaggaa ctttaatgtt cccgactcgg 480 gtgattccag ctgtgttgct ggcagtgttgtctcaaccct ctccctaaaa tgactgagcc 540 ctgggttcat ctaatgtggt tttccttaggaagagataga aggcacagaa gatc 594 60 1080 DNA Homo sapiens 60 atgaaggccactatcatcct ccttctgctt gcacaagttt cctgggctgg accgtttcaa 60 cagagaggcttatttgactt tatgctagaa gatgaggctt ctgggatagg cccagaagtt 120 cctgatgaccgcgacttcga gccctcccta ggcccagtgt gccccttccg ctgtcaatgc 180 catcttcgagtggtccagtg ttctgatttg ggtctggaca aagtgccaaa ggatcttccc 240 cctgacacaactctgctaga cctgcaaaac aacaaaataa ccgaaatcaa agatggagac 300 tttaagaacctgaagaacct tcacgcattg attcttgtca acaataaaat tagcaaagtt 360 agtcctggagcatttacacc tttggtgaag ttggaacgac tttatctgtc caagaatcag 420 ctgaaggaattgccagaaaa aatgcccaaa actcttcagg agctgcgtgc ccatgagaat 480 gagatcaccaaagtgcgaaa agttactttc aatggactga accagatgat tgtcatagaa 540 ctgggcaccaatccgctgaa gagctcagga attgaaaatg gggctttcca 
gggaatgaag 600 aagctctcctacatccgcat tgctgatacc aatatcacca gcattcctca aggtcttcct 660 ccttcccttacggaattaca tcttgatggc aacaaaatca gcagagttga tgcagctagc 720 ctgaaaggactgaataattt ggctaagttg ggattgagtt tcaacagcat ctctgctgtt 780 gacaatggctctctggccaa cacgcctcat ctgagggagc ttcacttgga caacaacaag 840 cttaccagagtacctggtgg gctggcagag cataagtaca tccaggttgt ctaccttcat 900 aacaacaatatctctgtagt tggatcaagt gacttctgcc cacctggaca caacaccaaa 960 aaggcttcttattcgggtgt gagtcttttc agcaacccgg tccagtactg ggagatacag 1020 ccatccaccttcagatgtgt ctacgtgcgc tctgccattc aactcggaaa ctataagtaa 1080 61 1923 DNA Homo sapiens 61 ccgcgccgct ccccgttgcc ttccaggact gagaaagggg aaagggaagggtgccacgtc 60 cgagcagccg ccttgactgg ggaagggtct gaatcccacc cttggcattgcttggtggag 120 actgagatac ccgtgctccg ctcgcctcct tggttgaaga tttctccttccctcacgtga 180 tttgagcccc gtttttattt tctgtgagcc acgtcctcct cgagcggggtcaatctggca 240 aaaggagtga tgcgcttcgc ctggaccgtg ctcctgctcg ggcctttgcagctctgcgcg 300 ctagtgcact gcgcccctcc cgccgccggc caacagcagc ccccgcgcgagccgccggcg 360 gctccgggcg cctggcgcca gcagatccaa tgggagaaca acgggcaggtgttcagcttg 420 ctgagcctgg gctcacagta ccagcctcag cgccgccggg acccgggcgccgccgtccct 480 ggtgcagcca acgcctccgc ccagcagccc cgcactccga tcctgctgatccgcgacaac 540 cgcaccgccg cggcgcgaac gcggacggcc ggctcatctg gagtcaccgctggccgcccc 600 aggcccaccg cccgtcactg gttccaagct ggctactcga catctagagcccgcgaagct 660 ggcgcctcgc gcgcggagaa ccagacagcg ccgggagaag ttcctgcgctcagtaacctg 720 cggccgccca gccgcgtgga cggcatggtg ggcgacgacc cttacaacccctacaagtac 780 tctgacgaca acccttatta caactactac gatacttatg aaaggcccagacctgggggc 840 aggtaccggc ccggatacgg cactggctac ttccagtacg gtctcccagacctggtggcc 900 gacccctact acatccaggc gtccacgtac gtgcagaaga tgtccatgtacaacctgaga 960 tgcgcggcgg aggaaaactg tctggccagt acagcataca gggcagatgtcagagattat 1020 gatcacaggg tgctgctcag atttccccaa agagtgaaaa accaagggacatcagatttc 1080 ttacccagcc gaccaagata ttcctgggaa tggcacagtt gtcatcagacattaccacag 1140 tatggatgag tttagccact atgacctgct tgatgccaac acccagaggagagtggctga 1200 aggccacaaa 
gcaagtttct gtcttgaaga cacatcctgt gactatggctaccacaggcg 1260 atttgcatgt actgcacaca cacagggatt gagtcctggc tgttatgatacctatggtgc 1320 agacatagac tgccagtgga ttgatattac agatgtaaaa cctggaaactatatcctaaa 1380 ggtcagtgta aaccccagct acctggttcc tgaatctgac tataccaacaatgttgtgcg 1440 ctgtgacatt cgctacacag gacatcatgc gtatgcctca ggctgcacaatttcaccgta 1500 ttagaaggca aagcaaaact cccaatggat aaatcagtgc ctggtgttctgaagtgggaa 1560 aaaatagact aacttcagta ggatttatgt attttgaaaa agagaacagaaaacaacaaa 1620 agaatttttg tttggactgt tttcaataac aaagcacata actggattttgaacgcttaa 1680 gtcatcatta cttgggaaat ttttaatgtt tattatttac atcactttgtgaattaacac 1740 agtgtttcaa ttctgtaatt acatatttga ctctttcaaa gaaatccaaatttctcatgt 1800 tccttttgaa attgtagtgc aaaatggtca gtattatcta aatgaatgagccaaaatgac 1860 tttgaactga aacttttcta aagtgctgga actttagtga aacataataataatgggttt 1920 ata 1923 62 3488 DNA Homo sapiens Unsure (503)..(616) a,or c, or g, or t 62 atggggggat gcacggtgaa gcctcagctg ctgctcctggcgctcgtcct ccacccctgg 60 aatccctgtc tgggtgcgga ctcggagaag ccctcgagcatccccacaga taaattatta 120 gtcataactg tagcaacaaa agaaagtgat ggattccatcgatttatgca gtcagccaaa 180 tatttcaatt atactgtgaa ggtccttggt caaggagaagaatggagagg tggtgatgga 240 attaatagta ttggaggggg ccagaaagtg agattaatgaaagaagtcat ggaacactat 300 gctgatcaag atgatctggt tgtcatgttt actgaatgctttgatgtcat atttgctggt 360 ggtccagaag aagttctaaa aaaattccaa aaggcaaaccacaaagtggt ctttgcagca 420 gatggaattt tgtggccaga taaaagacta gcagacaagtatcctgttgt gcacattggg 480 aaacgctatc tgaattcagg agnnnnnnnn nnnnnnnnnnnnnnnnnnnn nnnnnnnnnn 540 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnnnnnnnnnnnn nnnnnnnnnn 600 nnnnnnnnnn nnnnngaagc tattaacatc acattggatcacaaatgcaa aattttccag 660 accttaaatg gagctgtaga tgaagttgtt ttaaaatttgaaaatggcaa agccagagct 720 aagaatacat tttatgaaac attaccagtg gcaattaatggaaatggacc caccaagatt 780 ctcctgaatt attttggaaa ctatgtaccc aattcatggacacaggataa tggctgcact 840 ctttgtgaat tcgatacagt cgacttgtct gcagtagatgtccatccaaa cgtatcaata 900 ggtgttttta ttgagcaacc aacccctttt ctacctcggtttctggacat 
attgttgaca 960 ctggattacc caaaagaagc acttaaactt tttattcataacaaagtaag tttatcatga 1020 aaaggacatc aaggtatttt ttgataaagc taagcatgaaatcaaaacta taaaaatagt 1080 aggaccagaa gaaaatctaa gtcaagcgga agccagaaacatgggaatgg acttttgccg 1140 tcaggatgaa aagtgtgatt attactttag tgtggatgcagatgttgttt tgacaaatcc 1200 aaggacttta aaaattttga ttgaacaaaa cagaaagatcattgctcctc ttgtaactcg 1260 tcatggaaag ctgtggtcca atttctgggg agcattgagtcctgatggat actatgcacg 1320 atctgaagat tatgtggata ttgttcaagg gaatagagtaggagtatgga atgtcccata 1380 tatggctaat gtgtacttaa ttaaaggaaa gacactccgatcagagatga atgaaaggaa 1440 ctattttgtt cgtgataaac tggatcctga tatggctctttgccgaaatg ctagagaaat 1500 gggtgtattt atgtacattt ctaatagaca tgaatttggaaggctattat ccactgctaa 1560 ttacaatact tcccattata acaatgacct ctggcagatttttgaaaatc ctgtggactg 1620 gaaggaaaag tatataaacc gtgattattc aaagattttcactgaaaata tagttgaaca 1680 gccctgtcca gatgtctttt ggttccccat attttctgaaaaagcctgtg atgaattggt 1740 agaagaaatg gaacattacg gcaaatggtc tgggggaaaacatcatgata gccgtatatc 1800 tggtggttat gaaaatgtcc caactgatga tatccacatgaagcaagttg atctggagaa 1860 tgtatggctt cattttatcc gggagttcat tgcaccagttacactgaagg tctttgcagg 1920 ctattatacg aagggatttg cactactgaa ttttgtagtaaaatactccc ctgaacgaca 1980 gcgttctctt cgtcctcatc atgatgcttc tacatttaccataaacattg cacttaataa 2040 cgtgggagaa gactttcagg gaggtggttg caaatttctaaggtacaatt gctctattga 2100 gtcaccacga aaaggctgga gcttcatgca tcctgggagactcacacatt tgcatgaagg 2160 acttcctgtt aaaaatggaa caagatacat tgcagtgtcatttatagatc cctaagttat 2220 ttacttttca ttgaattgaa atttattttg gatgaatgactggcatgaac acgtctttga 2280 agttgtggct gagaagatga gaggaatatt taaataacatcaacagaaca acttcacttt 2340 gggccaaaca tttgaaaaac tttttataaa aaattgtttgatatttctta atgtctgctc 2400 tgagccttaa aacacagatt gaagaagaaa agaaagaaaaaacttaaata tttatttcta 2460 tgctttgttg cctctgagaa taatgacaat ttatgaatttgtgtttcaaa ttgataaaat 2520 atttaggtac aaataacaag actaataata ttttcttatttaaaaaaagc atgggaagat 2580 ttttatttat caaaatatag aggaaatgta gacaaaatggatataaatga aaattaccat 2640 gttgtaaaac cttgaaaatc 
agattctaac tggatttgtatgcaactaag tatttttctg 2700 aacacctatg caggtcttat ttacagtagt tactaagggaacacacaaag aattacacaa 2760 cgttttcctc aagaaaatgg tacaaaacac aaccgaggagcgtatacagt tgaaaacatt 2820 tttgttttga ttggaaggca gattatttta tattagtattaaaaatcaaa ccctatgttt 2880 ctttcagatg aatcttccaa agtggattat attaagcaggtattagattt aggaaaacct 2940 ttccatttct taaagtatta tcaagtgtca agatcagcaagtgtccttaa gtcaaacagg 3000 ttttttttgt tgttgttttt gctttgtttc cttttttagaaagttctaga aaataggaaa 3060 acgaaaaatt tcattgagat gagtagtgca tttaattattttttaaaaaa ctttttaagt 3120 acttgaattt tatatcagga aaacaaagtt gttgagccttgcttcttccg ttttgccctt 3180 tgtctcgctc cttattcttt ttttgggggg agggttatttgcttttttat cttcctggca 3240 taatttccat tttattcttc tgagtgtcta tgttaacttccctctatccc gcttataaaa 3300 aaattctcca acaaaaatac ttgttgactt gatgttttatcacttctcta agtaaggttg 3360 aaatatcctt attgtagcta ctgtttttaa tgtaaaggttaaacttgaaa agaaattctt 3420 aatcacggtg ccaaaattca ttttctaaca ccatgtgttagaaaattata aaaaataaaa 3480 taatttta 3488 63 2720 DNA Homo sapiens 63agcgggctga gggtaggaag tagccgctcc gagtggaggc gactgggggc tgaagagcgc 60gccgccctct cgtcccactt ttcaggtgtg tgatcctgta aaattaaatc ttccaagatg 120atctggtata tattaattat aggaattctg cttccccagt ctttggctca tccaggcttt 180tttacttcaa ttggtcagat gactgatttg atccatactg agaaagatct ggtgacttct 240ctgaaagatt atattaaggc agaagaggac aagttagaac aaataaaaaa atgggcagag 300aagttagatc ggctaactag tacagcgaca aaagatccag aaggatttgt tgggcatcca 360gtaaatgcat tcaaattaat gaaacgtctg aatactgagt ggagtgagtt ggagaatctg 420gtccttaagg atatgtcaga tggctttatc tctaacctaa ccattcagag acagtacttt 480cctaatgatg aagatcaggt tggggcagcc aaagctctgt tacgtctcca ggatacctac 540aatttggata cagataccat ctcaaagggt aatcttccag gagtgaaaca caaatctttt 600ctaacggctg aggactgctt tgagttgggc aaagtggcct atacagaagc agattattac 660catacggaac tgtggatgga acaagcccta aggcaactgg atgaaggcga gatttctacc 720atagataaag tctctgttct agattatttg agctatgcgg tatatcagca gggagacctg 780gataaggcac ttttgctcac aaagaagctt cttgaactag atcctgaaca tcagagagct 840aatggtaact taaaatattt tgagtatata 
atggctaaag aaaaagatgt caataagtct 900gcttcagatg accaatctga tcagaaaact acaccaaaga aaaaaggggt tgctgtggat 960tacctgccag agagacagaa gtacgaaatg ctgtgccgtg gggagggtat caaaatgacc 1020cctcggagac agaaaaaact cttttgccgc taccatgatg gaaaccgtaa tcctaaattt 1080attctggctc cagctaaaca ggaggatgaa tgggacaagc ctcgtattat tcgcttccat 1140gatattattt ctgatgcaga aattgaaatc gtcaaagacc tagcaaaacc aaggctgagg 1200cgagccacca tttcaaaccc aataacagga gacttggaga cggtacatta cagaattagc 1260aaaagtgcct ggctctctgg ctatgaaaat cctgtggtgt ctcgaattaa tatgagaata 1320caagatctaa caggactaga tgtttccaca gcagaggaat tacaggtagc aaattatgga 1380gttggaggac agtatgaacc ccattttgac tttgcacgga aagatgagcc agatgctttc 1440aaagagctgg ggacaggaaa tagaattgct acatggctgt tttatatgag tgatgtgtct 1500gcaggaggag ccactgtttt tcctgaagtt ggagctagtg tttggcccaa aaaaggaact 1560gctgttttct ggtataatct gtttgccagt ggagaaggag attatagtac acggcatgca 1620gcctgtccag tgctagttgg caacaaatgg gtatccaata aatggctcca tgaacgtgga 1680caagaatttc gaagaccttg tacgttgtca gaattggaat gacaaacagg cttccctttt 1740tctcctattg ttgtactctt atgtgtctga tatacacatt tcctagtctt aactttcagg 1800agtttacaat tgactaacac tccatgattg attcagtcat gaacctcatc ccatgtttca 1860tctgtggaca attgcttact ttgtgggttc ttttaaaagt aacacgaaat catcatattg 1920cataaaacct taaagttctg ttggtatcac agaagacaag gcagagttta aagtgaggaa 1980ttttatattt aaagaacttt ttggttggat aaaaacataa tttgagcatc cagttttagt 2040atttcactac atctcagttg gtgggtgtta agctagaatg ggctgtgtga taggaaacaa 2100atgccttaca gatgtgccta ggtgttctgt ttacctagtg tcttactctg ttttctggat 2160ctgaagacta gtaataaact aggacactaa ctgggttcca tgtgattgcc ctttcatatg 2220atcttctaag ttgatttttt tcctcccaag tcttttttaa agaaagtata ctgtatttta 2280ccaaccccct ctcttttctt ttagctcctc tgtggtgaat taaacgtact tgagttaaaa 2340tatttcgatt tttttttttt ttttaatgga aagtcctgca taacaacact gggccttctt 2400aactaaaatg ctcaccactt agcctgtttt tttatccctt ttttaaaatg acagatgatt 2460ttgttcagga attttgctgt ttttcttagt gctaatacct tgcctcttat tcctgctaca 2520gcagggtggt aatattggca ttctgattaa atactgtgcc ttaggagact ggaagtttaa 
2580aaatgtacaa gtcctttcag tgatgaggga attgattttt tttaaaagtc tttttcttag 2640aaagccaaaa tgtttgtttt tttaagattc tgaaatgtgt tgtgacaaca atgacctatt 2700tatgatctta aatctttttt 2720 64 1506 DNA Homo sapiens 64 agcactcccggagcctgcaa cgcttgagat cctctccgcg cccgccaccc cgcagggtgc 60 cccgcgccgttcccgccgcc ccgccgcccc cgtcgcgggc ccctgcaccc cgagcatccg 120 ccccgggtggcacgtccccg agcccaccag gccggccccg tctccccatc cgtctagtcc 180 gctcgcggtgccatgccatt cctcgggcag gactggcggt cccccgggca gaactgggtg 240 aagacggccgacggctggaa gcgcttcctg gatgagaaga gcggcagttt cgtgagcgac 300 ctcagcagttactgcaacaa ggaggtatac aataaggaga atcttttcaa cagcctgaac 360 tatgatgttgcagccaagaa gagaaagaag gacatgctga atagcaaaac caaaactcag 420 tatttccaccaagaaaaatg gatctatgtt cacaaaggaa gtactaaaga gcgccatgga 480 tattgcaccctgggggaagc tttcaacaga ctggacttct caactgccat tctggattcc 540 agaagatttaactacgtggt ccggctgttg gagctgatag caaagtcaca gctcacatcc 600 ctgagtggcatcgcccaaaa gaacttcatg aatattttgg aaaaagtggt actgaaagtc 660 cttgaagaccagcaaaacat tagactaata agggaactac tccagaccct ctacacatcc 720 ttatgtacactggtccaaag agtcggcaag tctgtgctgg tcgggaacat taacatgtgg 780 gtgtatcggatggagacgat tctccactgg cagcagcagc tgaacaacat tcagatcacc 840 aggcctgccttcaaaggcct caccttcact gacctgcctt tgtgcctaca actgaacatc 900 atgcagaggctgagcgacgg gcgggacctg gtcagcctgg gccaggctgc ccccgacctg 960 cacgtgctcagcgaagaccg gctgctgtgg aagaaactct gccagtacca cttctccgag 1020 cggcagatccgcaaacgatt aattctgtca gacaaagggc agctggattg gaagaagatg 1080 tatttcaaacttgtccgatg ttacccaagg aaagagcagt atggagatac ccttcagctc 1140 tgcaaacactgtcacatcct ttcctggaag ggcactgacc atccgtgcac tgccaataac 1200 ccagagagctgctccgtttc actttcaccc caggacttta tcaacttgtt caagttctga 1260 atcccagcacacgacaacac ttcagaaggg tccccctgct gactggagag ctgggaatat 1320 ggcatttggacacttcattt gtaaatagtg tacattttaa acattggctc gaaacttcag 1380 agataagtcatggagaggac attggagggg agaaatgcag ttgctgactg ggaatttaag 1440 aatgtgaacttctcactaga attggtatgg aaaagcaaaa tactgtaaat aaactttttt 1500 tctaac 150665 4204 DNA Homo sapiens 65 caggacaggg aagagcgggc 
gctatgggga gccggacgccagagtcccct ctccacgccg 60 tgcagctgcg ctggggcccc cggcgccgac ccccgctcgtgccgctgctg ttgctgctcg 120 tgccgccgcc acccagggtc gggggcttca acttagacgcggaggcccca gcagtactct 180 cggggccccc gggctccttc ttcggattct cagtggagttttaccggccg ggaacagacg 240 gggtcagtgt gctggtggga gcacccaagg ctaataccagccagccagga gtgctgcagg 300 gtggtgctgt ctacctctgt ccttggggtg ccagccccacacagtgcacc cccattgaat 360 ttgacagcaa aggctctcgg ctcctggagt cctcactgtccagctcagag ggagaggagc 420 ctgtggagta caagtccttg cagtggttcg gggcaacagttcgagcccat ggctcctcca 480 tcttggcatg cgctccactg tacagctggc gcacagagaaggagccactg agcgaccccg 540 tgggcacctg ctacctctcc acagataact tcacccgaattctggagtat gcaccctgcc 600 gctcagattt cagctgggca gcaggacagg gttactgccaaggaggcttc agtgccgagt 660 tcaccaagac tggccgtgtg gttttaggtg gaccaggaagctatttctgg caaggccaga 720 tcctgtctgc cactcaggag cagattgcag aatcttattaccccgagtac ctgatcaacc 780 tggttcaggg gcagctgcag actcgccagg ccagttccatctatgatgac agctacctag 840 gatactctgt ggctgttggt gaattcagtg gtgatgacacagaagacttt gttgctggtg 900 tgcccaaagg gaacctcact tacggctatg tcaccatccttaatggctca gacattcgat 960 ccctctacaa cttctcaggg gaacagatgg cctcctactttggctatgca gtggccgcca 1020 cagacgtcaa tggggacggg ctggatgact tgctggtgggggcacccctg ctcatggatc 1080 ggacccctga cgggcggcct caggaggtgg gcagggtctacgtctacctg cagcacccag 1140 ccggcataga gcccacgccc acccttaccc tcactggccatgatgagttt ggccgatttg 1200 gcagctcctt gacccccctg ggggacctgg accaggatggctacaatgat gtggccatcg 1260 gggctccctt tggtggggag acccagcagg gagtagtgtttgtatttcct gggggcccag 1320 gagggctggg ctctaagcct tcccaggttc tgcagcccctgtgggcagcc agccacaccc 1380 cagacttctt tggctctgcc cttcgaggag gccgagacctggatggcaat ggatatcctg 1440 atctgattgt ggggtccttt ggtgtggaca aggctgtggtatacaggggc cgccccatcg 1500 tgtccgctag tgcctccctc accatcttcc ccgccatgttcaacccagag gagcggagct 1560 gcagcttaga ggggaaccct gtggcctgca tcaaccttagcttctgcctc aatgcttctg 1620 gaaaacacgt tgctgactcc attggtttca cagtggaacttcagctggac tggcagaagc 1680 agaagggagg ggtacggcgg gcactgttcc tggcctccaggcaggcaacc ctgacccaga 1740 
ccctgctcat ccagaatggg gctcgagagg attgcagagagatgaagatc tacctcagga 1800 acgagtcaga atttcgagac aaactctcgc cgattcacatcgctctcaac ttctccttgg 1860 acccccaagc cccagtggac agccacggcc tcaggccagccctacattat cagagcaaga 1920 gccggataga ggacaaggct cagatcttgc tggactgtggagaagacaac atctgtgtgc 1980 ctgacctgca gctggaagtg tttggggagc agaaccatgtgtacctgggt gacaagaatg 2040 ccctgaacct cactttccat gcccagaatg tgggtgagggtggcgcctat gaggctgagc 2100 ttcgggtcac cgcccctcca gaggctgagt actcaggactcgtcagacac ccagggaact 2160 tctccagcct gagctgtgac tactttgccg tgaaccagagccgcctgctg gtgtgtgacc 2220 tgggcaaccc catgaaggca ggagccagtc tgtggggtggccttcggttt acagtccctc 2280 atctccggga cactaagaaa accatccagt ttgacttccagatcctcagc aagaatctca 2340 acaactcgca aagcgacgtg gtttcctttc ggctctccgtggaggctcag gcccaggtca 2400 ccctgaacgg tgtctccaag cctgaggcag tgctattcccagtaagcgac tggcatcccc 2460 gagaccagcc tcagaaggag gaggacctgg gacctgctgtccaccatgtc tatgagctca 2520 tcaaccaagg ccccagctcc attagccagg gtgtgctggaactcagctgt ccccaggctc 2580 tggaaggtca gcagctccta tatgtgacca gagttacgggactcaactgc accaccaatc 2640 accccattaa cccaaagggc ctggagttgg atcccgagggttccctgcac caccagcaaa 2700 aacgggaagc tccaagccgc agctctgctt cctcgggacctcagatcctg aaatgcccgg 2760 aggctgagtg tttcaggctg cgctgtgagc tcgggcccctgcaccaacaa gagagccaaa 2820 gtctgcagtt gcatttccga gtctgggcca agactttcttgcagcgggag caccagccat 2880 ttagcctgca gtgtgaggct gtgtacaaag ccctgaagatgccctaccga atcctgcctc 2940 ggcagctgcc ccaaaaagag cgtcaggtgg ccacagctgtgcaatggacc aaggcagaag 3000 gcagctatgg cgtcccactg tggatcatca tcctagccatcctgtttggc ctcctgctcc 3060 taggtctact catctacatc ctctacaagc ttggattcttcaaacgctcc ctcccatatg 3120 gcaccgccat ggaaaaagct cagctcaagc ctccagccacctctgatgcc tgagtcctcc 3180 caatttcaga ctcccattcc tgaagaacca gtccccccaccctcattcta ctgaaaagga 3240 ggggtctggg tacttcttga aggtgctgac ggccagggagaagctcctct ccccagccca 3300 gagacatact tgaagggcca gagccagggg ggtgaggagctggggatccc tcccccccat 3360 gcactgtgaa ggacccttgt ttacacatac cctcttcatggatgggggaa ctcagatcca 3420 gggacagagg cccagcctcc ctgaagcctt 
tgcattttggagagtttcct gaaacaactg 3480 gaaagataac taggaaatcc attcacagtt ctttgggccagacatgccac aaggacttcc 3540 tgtccagctc caacctgcaa agatctgtcc tcagccttgccagagatcca aaagaagccc 3600 ccagtaagaa cctggaactt ggggagttaa gacctggcagctctggacag ccccaccctg 3660 gtgggccaac aaagaacact aactatgcat ggtgccccaggaccagctca ggacagatgc 3720 cacaaggata gatgctggcc cagggccaga gcccagctccaaggggaatc agaactcaaa 3780 tggggccaga tccagcctgg ggtctggagt tgatctggaacccagactca gacattggca 3840 ccaatccagg cagatccagg actatatttg ggcctgctccagacctgatc ctggaggccc 3900 agttcaccct gatttaggag aagccaggaa tttcccaggacctgaagggg ccatgatggc 3960 aacagatctg gaacctcagc ctggccagac acaggccctccctgttcccc agagaaaggg 4020 gagcccactg tcctgggcct gcagaatttg ggttctgcctgccagctgca ctgatgctgc 4080 ccctcatctc tctgcccaac ccttccctca ccttggcaccagacacccag gacttattta 4140 aactctgttg caagtgcaat aaatctgacc cagtgcccccactgaccaga actagaaaaa 4200 aaaa 4204 66 1733 DNA Homo sapiens 66gcacgttccg cggggactca tgccacgcgc gtcccggccc gacgcgcaat tagcagccac 60ctccgcagcc cgccgccacc gcctccctgc cctcccgggc tgccgcagct aggagctcca 120gccgtcgcct cgcgcaggct gcgggcattg tcctctcggt tcgccgcccg ggctgctgct 180gccgccgcgg actgctgcgg ggcccggacc cgcaccccag ggatacgctg ccgccgccgc 240cggccggccc ggcgcccggc ctccgttcgg tggtttccgc cctgcgttct ctgggttgct 300ctctcctggg tttttcctgc gtagctgagg aaggggaaga gaagtccagc cgccaagccc 360agccttcccc ggcgcgcagc cccgacgggg ccgcggcagg cgcggcgaga gcgctgacgg 420agccatgaga gagtacaaag tggtggtgct gggctcgggc ggcgtgggca agtccgcgct 480caccgtgcag ttcgtgacgg gctccttcat cgagaagtac gacccgacca tcgaagactt 540ttaccgcaag gagattgagg tggactcgtc gccgtcggtg ctggagatcc tggatacggc 600gggcaccgag cagttcgcgt ccatgcggga cctgtacatc aagaacggcc agggcttcat 660cctggtctac agcctcgtca accagcagag cttccaggac atcaagccca tgcgggacca 720gatcatccgc gtgaagcggt acgagcgcgt gcccatgatc ctggtgggca acaaggtgga 780cctggagggt gagcgcgagg tctcgtacgg ggagggcaag gccctggctg aggagtggag 840ctgccccttc atggagacgt cggccaaaaa caaagcctcg gtagacgagc tatttgccga 900gatcgtgcgg cagatgaact acgcggcgca gcccaacggc 
gatgagggct gctgctcggc 960ctgcgtgatc ctctgaggcg gccaccgcgc gccggccgcg ctctgcgcac aaaagccaaa 1020cgcatccgac tctctaaatg tgatttattt cttgctttga gattggagac cactttgcat 1080tggccagggt gtcttgggag cccggctggc ctccgcggcc ggcgtcccct gcctccaccc 1140tgtgcccgag ggggtgtccg gtcctgccca tccgatactc tggtggaaat gtggctcttt 1200gcagcatgta cgtttctccc tgattttggt tgatgcatat ttccccgttt aagtagccgt 1260tagggcgcag tatcggcagc ttgacaccca ccaagcaaaa gtttcagcct ggaaaaaaaa 1320tgggggggaa gggtggatga aaaggaggga gagaaggtgg aaatggtttt tttttttttt 1380tttctatttt ctttcttttt tttttttttt ttttttggtc aacagccgtt tttctagttc 1440caagttttaa atacatggaa ggaagtccgg gagaaccata tgaaggagca ggaggagagg 1500aagaaacttt ttttccttct tttccaggag tagctggaaa ttaagatcgg gttccttttc 1560tgccagcttg gaagggcaac cccatgactg attgcgattc tgaggatgtc tatgcaaagt 1620tggattcttg ttacagtgta tccaatctga agtattgcac atctgaactg ggactgttaa 1680cactgatgcc aatacagtgt ggggtgccag aaagtgtctg ctgatatttg tgg 1733 67 1499PRT Homo sapiens 67 Met Glu Arg Glu Pro Ala Gly Thr Glu Glu Pro Gly ProPro Gly Arg 1 5 10 15 Arg Arg Arg Arg Glu Gly Arg Thr Arg Thr Val ArgSer Asn Leu Leu 20 25 30 Pro Pro Pro Gly Ala Glu Asp Pro Ala Ala Gly AlaAla Lys Gly Glu 35 40 45 Arg Arg Arg Arg Arg Gly Cys Ala Gln His Leu AlaAsp Asn Arg Leu 50 55 60 Lys Thr Thr Lys Tyr Thr Leu Leu Ser Phe Leu ProLys Asn Leu Phe 65 70 75 80 Glu Gln Phe His Arg Pro Ala Asn Val Tyr PheVal Phe Ile Ala Leu 85 90 95 Leu Asn Phe Val Pro Ala Val Asn Ala Phe GlnPro Gly Leu Ala Leu 100 105 110 Ala Pro Val Leu Phe Ile Leu Ala Ile ThrAla Phe Arg Asp Leu Trp 115 120 125 Glu Asp Tyr Ser Arg His Arg Ser AspHis Lys Ile Asn His Leu Gly 130 135 140 Cys Leu Val Phe Ser Arg Glu GluLys Lys Tyr Val Asn Arg Phe Trp 145 150 155 160 Lys Glu Ile His Val GlyAsp Phe Val Arg Leu Arg Cys Asn Glu Ile 165 170 175 Phe Pro Ala Asp IleLeu Leu Leu Ser Ser Ser Asp Pro Asp Gly Leu 180 185 190 Cys His Ile GluThr Ala Asn Leu Asp Gly Glu Thr Asn Leu Lys Arg 195 200 205 Arg Gln ValVal Arg Gly Phe Ser Glu Leu Val Ser Glu Phe Asn Pro 210 215 220 Leu 
ThrPhe Thr Ser Val Ile Glu Cys Glu Lys Pro Asn Asn Asp Leu 225 230 235 240Ser Arg Phe Arg Gly Cys Ile Ile His Asp Asn Gly Lys Lys Ala Gly 245 250255 Leu Tyr Lys Glu Asn Leu Leu Leu Arg Gly Cys Thr Leu Arg Asn Thr 260265 270 Asp Ala Val Val Gly Ile Val Ile Tyr Ala Gly His Glu Thr Lys Ala275 280 285 Leu Leu Asn Asn Ser Gly Pro Arg Tyr Lys Arg Ser Lys Leu GluArg 290 295 300 Gln Met Asn Cys Asp Val Leu Trp Cys Val Leu Leu Leu ValCys Met 305 310 315 320 Ser Leu Phe Ser Ala Val Gly His Gly Leu Trp IleTrp Arg Tyr Gln 325 330 335 Glu Lys Lys Ser Leu Phe Tyr Val Pro Lys SerAsp Gly Ser Ser Leu 340 345 350 Ser Pro Val Thr Ala Ala Val Tyr Ser PheLeu Thr Met Ile Ile Val 355 360 365 Leu Gln Val Leu Ile Pro Ile Ser LeuTyr Val Ser Ile Glu Ile Val 370 375 380 Lys Ala Cys Gln Val Tyr Phe IleAsn Gln Asp Met Gln Leu Tyr Asp 385 390 395 400 Glu Glu Thr Asp Ser GlnLeu Gln Cys Arg Ala Leu Asn Ile Thr Glu 405 410 415 Asp Leu Gly Gln IleGln Tyr Ile Phe Ser Asp Lys Thr Gly Thr Leu 420 425 430 Thr Glu Asn LysMet Val Phe Arg Arg Cys Thr Val Ser Gly Val Glu 435 440 445 Tyr Ser HisAsp Ala Asn Ala Gln Arg Leu Ala Arg Tyr Gln Glu Ala 450 455 460 Asp SerGlu Glu Glu Glu Val Val Pro Arg Gly Gly Ser Val Ser Gln 465 470 475 480Arg Gly Ser Ile Gly Ser His Gln Ser Val Arg Val Val His Arg Thr 485 490495 Gln Ser Thr Lys Ser His Arg Arg Thr Gly Ser Arg Ala Glu Ala Lys 500505 510 Arg Ala Ser Met Leu Ser Lys His Thr Ala Phe Ser Ser Pro Met Glu515 520 525 Lys Asp Ile Thr Pro Asp Pro Lys Leu Leu Glu Lys Val Ser GluCys 530 535 540 Asp Lys Ser Leu Ala Val Ala Arg His Gln Glu His Leu LeuAla His 545 550 555 560 Leu Ser Pro Glu Leu Ser Asp Val Phe Asp Phe PheIle Ala Leu Thr 565 570 575 Ile Cys Asn Thr Val Val Val Thr Ser Pro AspGln Pro Arg Thr Lys 580 585 590 Val Arg Val Arg Phe Glu Leu Lys Ser ProVal Lys Thr Ile Glu Asp 595 600 605 Phe Leu Arg Arg Phe Thr Pro Ser CysLeu Thr Ser Gly Cys Ser Ser 610 615 620 Ile Gly Ser Leu Ala Ala Asn LysSer Ser His Lys Leu Gly Ser Ser 625 630 635 640 Phe Pro Ser Thr Pro SerSer Asp 
Gly Met Leu Leu Arg Leu Glu Glu 645 650 655 Arg Leu Gly Gln ProThr Ser Ala Ile Ala Ser Asn Gly Tyr Ser Ser 660 665 670 Gln Ala Asp AsnTrp Ala Ser Glu Leu Ala Gln Glu Gln Glu Ser Glu 675 680 685 Arg Glu LeuArg Tyr Glu Ala Glu Ser Pro Asp Glu Ala Ala Leu Val 690 695 700 Tyr AlaAla Arg Ala Tyr Asn Cys Val Leu Val Glu Arg Leu His Asp 705 710 715 720Gln Val Ser Val Glu Leu Pro His Leu Gly Arg Leu Thr Phe Glu Leu 725 730735 Leu His Thr Leu Gly Phe Asp Ser Val Arg Lys Arg Met Ser Val Val 740745 750 Ile Arg His Pro Leu Thr Asp Glu Ile Asn Val Tyr Thr Lys Gly Ala755 760 765 Asp Ser Val Val Met Asp Leu Leu Gln Pro Cys Ser Ser Val AspAla 770 775 780 Arg Gly Arg His Gln Lys Lys Ile Arg Ser Lys Thr Gln AsnTyr Leu 785 790 795 800 Asn Val Tyr Ala Ala Glu Gly Leu Arg Thr Leu CysIle Ala Lys Arg 805 810 815 Val Leu Ser Lys Glu Glu Tyr Ala Cys Trp LeuGln Ser His Leu Glu 820 825 830 Ala Glu Ser Ser Leu Glu Asn Ser Glu GluLeu Leu Phe Gln Ser Ala 835 840 845 Ile Arg Leu Glu Thr Asn Leu His LeuLeu Gly Ala Thr Gly Ile Glu 850 855 860 Asp Arg Leu Gln Asp Gly Val ProGlu Thr Ile Ser Lys Leu Arg Gln 865 870 875 880 Ala Gly Leu Gln Ile TrpVal Leu Thr Gly Asp Lys Gln Glu Thr Ala 885 890 895 Val Asn Ile Ala TyrAla Cys Lys Leu Leu Asp His Asp Glu Glu Val 900 905 910 Ile Thr Leu AsnAla Thr Ser Gln Glu Ala Cys Ala Ala Leu Leu Asp 915 920 925 Gln Cys LeuCys Tyr Val Gln Ser Arg Gly Leu Gln Arg Ala Pro Glu 930 935 940 Lys ThrLys Gly Lys Val Ser Met Arg Phe Ser Ser Leu Cys Pro Pro 945 950 955 960Ser Thr Ser Thr Ala Ser Gly Arg Arg Pro Ser Leu Val Ile Asp Gly 965 970975 Arg Ser Leu Ala Tyr Ala Leu Glu Lys Asn Leu Glu Asp Lys Phe Leu 980985 990 Phe Leu Ala Lys Gln Cys Arg Ser Val Leu Cys Cys Arg Ser Thr Pro995 1000 1005 Leu Gln Lys Ser Met Val Val Lys Leu Val Arg Ser Lys LeuLys 1010 1015 1020 Ala Met Thr Leu Ala Ile Gly Asp Gly Ala Asn Asp ValSer Met 1025 1030 1035 Ile Gln Val Ala Asp Val Gly Val Gly Ile Ser GlyGln Glu Gly 1040 1045 1050 Met Gln Ala Val Met Ala Ser Asp Phe Ala ValPro Lys Phe Arg 1055 
1060 1065 Tyr Leu Glu Arg Leu Leu Ile Leu His GlyHis Trp Cys Tyr Ser 1070 1075 1080 Arg Leu Ala Asn Met Val Leu Tyr PhePhe Tyr Lys Asn Thr Met 1085 1090 1095 Phe Val Gly Leu Leu Phe Trp PheGln Phe Phe Cys Gly Phe Ser 1100 1105 1110 Ala Ser Thr Met Ile Asp GlnTrp Tyr Leu Ile Phe Phe Asn Leu 1115 1120 1125 Leu Phe Ser Ser Leu ProPro Leu Val Thr Gly Val Leu Asp Arg 1130 1135 1140 Asp Val Pro Ala AsnVal Leu Leu Thr Asn Pro Gln Leu Tyr Lys 1145 1150 1155 Ser Gly Gln AsnMet Glu Glu Tyr Arg Pro Arg Thr Phe Trp Phe 1160 1165 1170 Asn Met AlaAsp Ala Ala Phe Gln Ser Leu Val Cys Phe Ser Ile 1175 1180 1185 Pro TyrLeu Ala Tyr Tyr Asp Ser Asn Val Asp Leu Phe Thr Trp 1190 1195 1200 GlyThr Pro Ile Val Thr Ile Ala Leu Leu Thr Phe Leu Leu His 1205 1210 1215Leu Gly Ile Glu Thr Lys Thr Trp Thr Trp Leu Asn Trp Ile Thr 1220 12251230 Cys Gly Phe Ser Val Leu Leu Phe Phe Thr Val Ala Leu Ile Tyr 12351240 1245 Asn Ala Ser Cys Ala Thr Cys Tyr Pro Pro Ser Asn Pro Tyr Trp1250 1255 1260 Thr Met Gln Ala Leu Leu Gly Asp Pro Val Phe Tyr Leu ThrCys 1265 1270 1275 Leu Met Thr Pro Val Ala Ala Leu Leu Pro Arg Leu PhePhe Arg 1280 1285 1290 Ser Leu Gln Gly Arg Val Phe Pro Thr Gln Leu GlnLeu Ala Arg 1295 1300 1305 Gln Leu Thr Arg Lys Ser Pro Arg Arg Cys SerAla Pro Lys Glu 1310 1315 1320 Thr Phe Ala Gln Gly Arg Leu Pro Lys AspSer Gly Thr Glu His 1325 1330 1335 Ser Ser Gly Arg Thr Val Lys Thr SerVal Pro Leu Ser Gln Pro 1340 1345 1350 Ser Trp His Thr Gln Gln Pro ValCys Ser Leu Glu Ala Ser Gly 1355 1360 1365 Glu Pro Ser Thr Val Asp MetSer Met Pro Val Arg Glu His Thr 1370 1375 1380 Leu Leu Glu Gly Leu SerAla Pro Ala Pro Met Ser Ser Ala Pro 1385 1390 1395 Gly Glu Ala Val LeuArg Ser Pro Gly Gly Cys Pro Glu Glu Ser 1400 1405 1410 Lys Val Arg AlaAla Ser Thr Gly Arg Val Thr Pro Leu Ser Ser 1415 1420 1425 Leu Phe SerLeu Pro Thr Phe Ser Leu Leu Asn Trp Ile Ser Ser 1430 1435 1440 Trp SerLeu Val Ser Arg Leu Gly Ser Val Leu Gln Phe Ser Arg 1445 1450 1455 ThrGlu Gln Leu Ala Asp Gly Gln Ala Gly Arg Gly Leu Pro Val 1460 
1465 1470 Gln Pro His Ser Gly Arg Ser Gly Leu Gln Gly Pro Asp His Arg 1475 1480 1485 Leu Leu Ile Gly Ala Ser Ser Arg Arg Ser Gln 1490 1495
68 12 DNA Artificial Sequence primer_bind (1)..(12) PCR primer 68 gatctgcggt ga 12
69 24 DNA Artificial Sequence primer_bind (1)..(24) PCR primer 69 agcactctcc agcctctcac cgca 24
70 12 DNA Artificial Sequence primer_bind (1)..(12) PCR primer 70 gatctgttca tg 12
71 24 DNA Artificial Sequence primer_bind (1)..(24) PCR primer 71 accgacgtcg actatccatg aaca 24
72 12 DNA Artificial Sequence primer_bind (1)..(12) PCR primer 72 gatcttccct cg 12
73 24 DNA Artificial Sequence primer_bind (1)..(24) PCR primer 73 aggcaactgt gctatccgag ggaa 24
74 20 DNA Artificial Sequence primer_bind (1)..(20) PCR primer 74 gctgctcaag ctcagaaacc 20
75 20 DNA Artificial Sequence primer_bind (1)..(20) PCR primer 75 ccctgccgtc tatttctttg 20
76 19 DNA Artificial Sequence primer_bind (1)..(19) PCR primer 76 tagtagctgg ggcagcaaa 19
77 20 DNA Artificial Sequence primer_bind (1)..(20) PCR primer 77 tggaagctcg gcttctttag 20
78 20 DNA Artificial Sequence primer_bind (1)..(20) PCR primer 78 acaaaagcct tgaggattgc 20
79 20 DNA Artificial Sequence primer_bind (1)..(20) PCR primer 79 aaaactgccg ttggcattag 20
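
The primer entries SEQ ID NO:68-73 come in 12-mer/24-mer pairs (68/69, 70/71, 72/73). As a consistency check on the sequences as transcribed here, and not as part of the disclosure itself, the following Python sketch computes how many nucleotides at the 3' end of each 12-mer can base-pair with the 3' end of its 24-mer partner. The pairing of the entries and the reading of each pair as a partially double-stranded adaptor are assumptions made for illustration; the function and variable names are hypothetical.

```python
# Sequences are copied from the listing (SEQ ID NO:68-73); the function and
# variable names are hypothetical, chosen only for this illustration.
PRIMERS = {
    68: "gatctgcggtga",
    69: "agcactctccagcctctcaccgca",
    70: "gatctgttcatg",
    71: "accgacgtcgactatccatgaaca",
    72: "gatcttccctcg",
    73: "aggcaactgtgctatccgagggaa",
}

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def three_prime_overlap(short: str, long: str) -> int:
    """Length of the longest 3' end of `short` that can pair with the 3' end of `long`."""
    # The 5' end of the reverse complement corresponds to the 3' end of `long`.
    rc_long = reverse_complement(long)
    for k in range(min(len(short), len(long)), 0, -1):
        if short[-k:] == rc_long[:k]:
            return k
    return 0


# Assumed pairing: each 12-mer with the 24-mer listed directly after it.
overlaps = {
    (short_id, long_id): three_prime_overlap(PRIMERS[short_id], PRIMERS[long_id])
    for short_id, long_id in [(68, 69), (70, 71), (72, 73)]
}
print(overlaps)
```

For the sequences as transcribed, each of the three pairs shares an 8-nucleotide complementary 3' region, and each 12-mer begins with gatc. A transcription error in any of these six entries would most likely change that result, so the check is a cheap way to validate a re-keyed copy of the listing.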

We claim:
1. An isolated nucleic acid molecule selected from the group consisting of: (a) a nucleic acid molecule which hybridizes under stringent conditions to a molecule consisting of a nucleotide sequence set forth as any of SEQ ID NO:1-11, and which codes for a polypeptide that induces differentiation of a mesenchymal cell, (b) nucleic acid molecules that differ from the nucleic acid molecules of (a) in codon sequence due to the degeneracy of the genetic code, and (c) complements of (a) or (b).
2. The isolated nucleic acid molecule of claim 1, wherein the isolated nucleic acid molecule comprises the nucleotide sequence set forth as any of SEQ ID NO:1-11.
3. The isolated nucleic acid molecule of claim 1, wherein the isolated nucleic acid molecule consists of a coding sequence of any nucleotide sequence set forth as any of SEQ ID NO:1-11.
4. An isolated nucleic acid molecule selected from the group consisting of (a) unique fragments of a nucleotide sequence set forth as any of SEQ ID NO:1-11, and (b) complements of (a), provided that a unique fragment of (a) includes a sequence of contiguous nucleotides which is not identical to any sequence in the prior art and any complements or fragments thereof.
5. The isolated nucleic acid molecule of claim 4, wherein the sequence of contiguous nucleotides is selected from the group consisting of: (1) at least two contiguous nucleotides nonidentical to the sequence group, (2) at least three contiguous nucleotides nonidentical to the sequence group, (3) at least four contiguous nucleotides nonidentical to the sequence group, (4) at least five contiguous nucleotides nonidentical to the sequence group, (5) at least six contiguous nucleotides nonidentical to the sequence group, and (6) at least seven contiguous nucleotides nonidentical to the sequence group.
6. The isolated nucleic acid molecule of claim 4, wherein the unique fragment has a size selected from the group consisting of at least: 8 nucleotides, 10 nucleotides, 12 nucleotides, 14 nucleotides, 16 nucleotides, 18 nucleotides, 20 nucleotides, 22 nucleotides, 24 nucleotides, 26 nucleotides, 28 nucleotides, 30 nucleotides, 50 nucleotides, 75 nucleotides, 100 nucleotides, and 200 nucleotides.
7. The isolated nucleic acid molecule of claim 4, wherein the molecule encodes a polypeptide which is immunogenic.
8. An expression vector comprising the isolated nucleic acid molecule of claims 1, 2, 3, 4, 5, 6, or 7 operably linked to a promoter.
9. An expression vector comprising the isolated nucleic acid molecule of claim 4 operably linked to a promoter.
10. A host cell transformed or transfected with the expression vector of claim 8.
11. A host cell transformed or transfected with the expression vector of claim 9.
12. An isolated polypeptide encoded by a nucleic acid molecule of claim 1, 2, 3, or 4, wherein the polypeptide, or fragment of the polypeptide, induces differentiation of a mesenchymal cell.
13. The isolated polypeptide of claim 12, wherein the polypeptide is encoded by a nucleic acid molecule of claim 2.
14. The isolated polypeptide of claim 13, wherein the polypeptide comprises a polypeptide having the sequence of amino acids 1-153 of SEQ ID NO:12.
15. An isolated polypeptide encoded by a nucleic acid molecule of claim 1, 2, 3, or 4, wherein the polypeptide, or fragment of the polypeptide, is immunogenic.
16. The isolated polypeptide of claim 15, wherein the fragment of the polypeptide, or portion of the fragment, binds to a human antibody.
17. An isolated binding polypeptide which binds selectively a polypeptide encoded by an isolated nucleic acid molecule of claim 1, 2, 3, or 4.
18. The isolated binding polypeptide of claim 17, wherein the isolated binding polypeptide binds to a polypeptide having the sequence of amino acids of SEQ ID NO:12.
19. The isolated binding polypeptide of claim 18, wherein the isolated binding polypeptide is an antibody or an antibody fragment selected from the group consisting of a Fab fragment, a F(ab)₂ fragment or a fragment including a CDR3 region.
20. A method for determining the level of any of SEQ ID NO:1-11 expression in a subject, comprising measuring expression of any of SEQ ID NO:1-11 in a test sample from the subject to determine the level of any of SEQ ID NO:1-11 expression in the subject.
21. The method of claim 20, wherein the measured expression of any of SEQ ID NO:1-11 in the test sample is compared to expression of any of SEQ ID NO:1-11, respectively, in a control containing a known level of expression.
22. The method of claim 20, wherein the expression of any of SEQ ID NO:1-11 is mRNA expression.
23. The method of claim 20, wherein the expression of any of SEQ ID NO:1-11 is polypeptide expression.
24. The method of claim 20, wherein the test sample is tissue.
25. The method of claim 20, wherein the test sample is a biological fluid.
26. The method of claim 22, wherein said mRNA expression is measured using PCR.
27. The method of claim 22, wherein said mRNA expression is measured using Northern blotting.
28. The method of claim 23, wherein said polypeptide expression is measured using monoclonal antibodies to any of SEQ ID NO:1-11 expression products thereof.
29. The method of claim 23, wherein said polypeptide expression is measured using polyclonal antisera to any of SEQ ID NO:1-11 expression products thereof.
30. The method of claim 23, wherein expression of any of SEQ ID NO:1-11, or expression products thereof, is measured using mesenchymal cell differentiation induction activity of any of SEQ ID NO:1-11, or expression products thereof.
31. A method for identifying an agent useful in modulating mesenchymal cell differentiation induction activity of a molecule, comprising: (a) contacting a molecule having mesenchymal cell differentiation induction activity with a candidate agent, (b) measuring mesenchymal cell differentiation induction activity of the molecule, and (c) comparing the measured mesenchymal cell differentiation induction activity of the molecule to a control to determine whether the candidate agent modulates mesenchymal cell differentiation induction activity of the molecule, wherein the molecule is a nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, or an expression product thereof.
32. A method of diagnosing a condition characterized by aberrant expression of a nucleic acid molecule or an expression product thereof, said method comprising: a) contacting a biological sample from a subject with an agent, wherein said agent specifically binds to said nucleic acid molecule, an expression product thereof, or a fragment of an expression product thereof; and b) measuring the amount of bound agent and determining therefrom if the expression of said nucleic acid molecule or of an expression product thereof is aberrant, aberrant expression being diagnostic of the condition; wherein the nucleic acid molecule is at least one nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66.
 33. The method of claim 32, wherein the nucleic acid molecule is at least two nucleic acid molecules, each selected from the group consisting of SEQ ID NO:1-11, and 13-66.
 34. The method of claim 32, wherein the nucleic acid molecule is at least three nucleic acid molecules, each selected from the group consisting of SEQ ID NO:1-11, and 13-66.
 35. The method of claim 32, wherein the nucleic acid molecule is at least four nucleic acid molecules, each selected from the group consisting of SEQ ID NO:1-11, and 13-66.
 36. The method of claim 32, wherein the nucleic acid molecule is at least five nucleic acid molecules, each selected from the group consisting of SEQ ID NO:1-11, and 13-66.
 37. The method of claim 32, wherein the condition involves cartilaginous tissue degeneration selected from the group consisting of osteoarthritis, rheumatoid arthritis, gout arthritis, adjuvant arthritis, arthritis deformans, infectious arthritis, and osteochondrosis.
 38. The method of claim 32, wherein the condition is osteoarthritis.
 39. A method for determining regression, progression or onset of a cartilaginous tissue degeneration condition in a subject characterized by aberrant expression of a nucleic acid molecule or an expression product thereof, comprising: monitoring a sample from a patient, for a parameter selected from the group consisting of (i) a nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, (ii) a polypeptide encoded by the nucleic acid, (iii) a peptide derived from the polypeptide, and (iv) an antibody which selectively binds the polypeptide or peptide, as a determination of regression, progression or onset of said cartilaginous tissue degeneration condition in the subject.
 40. The method of claim 39, wherein the sample is a biological fluid or a tissue.
 41. The method of claim 39, wherein the step of monitoring comprises contacting the sample with a detectable agent selected from the group consisting of (a) an isolated nucleic acid molecule which selectively hybridizes under stringent conditions to the nucleic acid molecule of (i), (b) an antibody which selectively binds the polypeptide of (ii), or the peptide of (iii), and (c) a polypeptide or peptide which binds the antibody of (iv).
 42. The method of claim 41, wherein the antibody, the polypeptide, the peptide or the nucleic acid is labeled with a radioactive label or an enzyme.
 43. The method of claim 39, comprising assaying the sample for the peptide.
 44. A kit, comprising a package containing: an agent that selectively binds to the isolated nucleic acid of claim 1 or an expression product thereof, and a control for comparing to a measured value of binding of said agent to said isolated nucleic acid of claim 1 or expression product thereof.
 45. The kit of claim 44, wherein the control is a predetermined value for comparing to the measured value.
 46. The kit of claim 44, wherein the control comprises an epitope of the expression product of the nucleic acid of claim 1.
 47. The kit of claim 44, further comprising a second agent that selectively binds to an isolated nucleic acid molecule of claim 1 or an expression product thereof, and a control for comparing to a measured value of binding of said second agent to said nucleic acid molecule or expression product thereof.
 48. A method for treating a cartilaginous tissue degeneration condition, comprising: administering to a subject in need of such treatment an agent that modulates expression of a molecule selected from the group consisting of SEQ ID NO:1-67, in an amount effective to treat the cartilaginous tissue degeneration condition.
 49. The method of claim 48, wherein the cartilaginous tissue degeneration condition is selected from the group consisting of osteoarthritis, rheumatoid arthritis, gout arthritis, adjuvant arthritis, arthritis deformans, infectious arthritis, and osteochondrosis.
 50. The method of claim 48, further comprising co-administering an agent selected from the group consisting of an osteogenic protein, Insulin-like Growth Factor, Transforming Growth Factor-β, and a proteoglycan.
 51. A method for treating a subject to reduce the risk of a cartilaginous tissue degeneration condition developing in the subject, comprising: administering to a subject who is known to express decreased levels of a molecule selected from the group consisting of SEQ ID NO:1-67, an agent for reducing the risk of a cartilaginous tissue degeneration condition in an amount effective to lower the risk of the subject developing a future cartilaginous tissue degeneration condition, wherein the agent is selected from the group consisting of an osteogenic protein, Insulin-like Growth Factor, Transforming Growth Factor-β, and a proteoglycan, or an agent that modulates expression of a molecule selected from the group consisting of SEQ ID NO:1-67.
 52. A method for identifying a candidate agent useful in the treatment of a cartilaginous tissue degeneration condition, comprising: determining expression of a set of nucleic acid molecules in a cell of mesenchymal origin or cartilaginous tissue under conditions which, in the absence of a candidate agent, permit a first amount of expression of the set of nucleic acid molecules, wherein the set of nucleic acid molecules comprises at least one nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, contacting the cell of mesenchymal origin or cartilaginous tissue with the candidate agent, and detecting a test amount of expression of the set of nucleic acid molecules, wherein an increase in the test amount of expression in the presence of the candidate agent relative to the first amount of expression indicates that the candidate agent is useful in the treatment of the cartilaginous tissue degeneration condition.
 53. The method of claim 52, wherein the cartilaginous tissue degeneration condition is selected from the group consisting of osteoarthritis, rheumatoid arthritis, gout arthritis, adjuvant arthritis, arthritis deformans, infectious arthritis, and osteochondrosis.
 54. The method of claim 52, wherein the condition is osteoarthritis.
 55. The method of claim 52, wherein the set of nucleic acid molecules comprises at least two nucleic acid molecules, each selected from the group consisting of SEQ ID NO:1-11, and 13-66.
 56. A pharmaceutical composition, comprising: an agent comprising an isolated nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, or an expression product thereof, in a pharmaceutically effective amount to treat a cartilaginous tissue degeneration condition, and a pharmaceutically acceptable carrier.
 57. The pharmaceutical composition of claim 56, wherein the agent is an expression product of the isolated nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66.
 58. The pharmaceutical composition of claim 57, wherein the cartilaginous tissue degeneration condition is selected from the group consisting of osteoarthritis, rheumatoid arthritis, gout arthritis, adjuvant arthritis, arthritis deformans, infectious arthritis, and osteochondrosis.
 59. A solid-phase nucleic acid molecule array consisting essentially of a set of nucleic acid molecules, expression products thereof, or fragments thereof, each nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66, fixed to a solid substrate.
 60. The solid-phase nucleic acid molecule array of claim 59, further comprising at least one control nucleic acid molecule.
 61. The solid-phase nucleic acid molecule array of claim 59, wherein the set of nucleic acid molecules comprises at least one nucleic acid molecule selected from the group consisting of SEQ ID NO:1-11, and 13-66.